summaryrefslogtreecommitdiffstats
path: root/include/rdma
Commit message (Collapse)AuthorAgeFilesLines
...
* | RDMA/core: Add signature attrs element for ib_mr structureMax Gurtovoy2019-06-242-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | This element will describe the needed characteristics for the signature operation per signature enabled memory region (type IB_MR_TYPE_INTEGRITY). Also add meta_length attribute to ib_sig_attrs structure for saving the mapped metadata length (needed for the new API implementation). Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | RDMA/core: Introduce ib_map_mr_sg_pi to map data/protection sgl'sMax Gurtovoy2019-06-241-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | This function will map the previously dma mapped SG lists for PI (protection information) and data to an appropriate memory region for future registration. The given MR must be allocated as IB_MR_TYPE_INTEGRITY. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | RDMA/core: Introduce IB_MR_TYPE_INTEGRITY and ib_alloc_mr_integrity APIIsrael Rukshin2019-06-241-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | This is a preparation for signature verbs API re-design. In the new design a single MR with IB_MR_TYPE_INTEGRITY type will be used to perform the needed mapping for data integrity operations. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | RDMA/core: Save the MR type in the ib_mr structureMax Gurtovoy2019-06-241-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a preparation for the signature verbs API change. This change is needed since the MR type will define, in the upcoming patches, the need for allocating internal resources in LLD for signature handover related operations. It will also help to make sure that signature related functions are called with an appropriate MR type and fail otherwise. Also introduce new mr types IB_MR_TYPE_USER, IB_MR_TYPE_DMA and IB_MR_TYPE_DM for correctness. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | RDMA/core: Introduce new header file for signature operationsMax Gurtovoy2019-06-242-111/+121
| | | | | | | | | | | | | | | | | | | | | | Ease the exhausted ib_verbs.h file and make the code more readable. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | IB/{rdmavt, qib, hfi1}: Convert to new completion APIMike Marciniszyn2019-06-201-36/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert all completions to use the new completion routine that fixes a race between post send and completion where fields from a SWQE can be read after SWQE has been freed. This patch also addresses issues reported in https://marc.info/?l=linux-kernel&m=155656897409107&w=2. The reserved operation path has no need for any barrier. The barrier for the other path is addressed by the smp_load_acquire() barrier. Cc: Andrea Parri <andrea.parri@amarulasolutions.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/rdmavt: Add new completion inlineMike Marciniszyn2019-06-201-0/+72
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is opencoded send completion logic all over all the drivers. We need to convert to this routine to enforce ordering issues for completions. This routine fixes an ordering issue where the read of the SWQE fields necessary for creating the completion can race with a post send if the post send catches a send queue at the edge of being full. Is is possible in that situation to read SWQE fields that are being written. This new routine insures that SWQE fields are read prior to advancing the index that post send uses to determine queue fullness. Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | RDMA: Convert destroy_wq to be voidLeon Romanovsky2019-06-201-1/+1
| | | | | | | | | | | | | | | | | | All callers of destroy WQ are always success and there is no need to check their return value, so convert destroy_wq to be void. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | RDMA: Report available cdevs through RDMA_NLDEV_CMD_GET_CHARDEVJason Gunthorpe2019-06-181-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Update the struct ib_client for all modules exporting cdevs related to the ibdevice to also implement RDMA_NLDEV_CMD_GET_CHARDEV. All cdevs are now autoloadable and discoverable by userspace over netlink instead of relying on sysfs. uverbs also exposes the DRIVER_ID for drivers that are able to support driver id binding in rdma-core. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | RDMA: Add NLDEV_GET_CHARDEV to allow char dev discovery and autoloadJason Gunthorpe2019-06-182-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow userspace to issue a netlink query against the ib_device for something like "uverbs" and get back the char dev name, inode major/minor, and interface ABI information for "uverbs0". Since we are now in netlink this can also trigger a module autoload to make the uverbs device come into existence. Largely this will let us replace searching and reading inside sysfs to setup devices, and provides an alternative (using driver_id) to device name based provider binding for things like rxe. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | RDMA: Move rdma_node_type to uapi/Jason Gunthorpe2019-06-171-12/+1
| | | | | | | | | | | | | | | | This enum is exposed over the sysfs file 'node_type' and over netlink via RDMA_NLDEV_ATTR_DEV_NODE_TYPE, so declare it in the uapi headers. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | RDMA: Convert CQ allocations to be under core responsibilityLeon Romanovsky2019-06-111-3/+3
| | | | | | | | | | | | | | | | | | | | Ensure that CQ is allocated and freed by IB/core and not by drivers. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Gal Pressman <galpress@amazon.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Tested-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | RDMA: Clean destroy CQ in drivers do not return errorsLeon Romanovsky2019-06-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Like all other destroy commands, .destroy_cq() call is not supposed to fail. In all flows, the attempt to return earlier caused to memory leaks. This patch converts .destroy_cq() to do not return any errors. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Gal Pressman <galpress@amazon.com> Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | RDMA: Move owner into struct ib_device_opsJason Gunthorpe2019-06-101-1/+1
| | | | | | | | | | | | | | | | | | This more closely follows how other subsytems work, with owner being a member of the structure containing the function pointers. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | RDMA: Move uverbs_abi_ver into struct ib_device_opsJason Gunthorpe2019-06-101-1/+1
| | | | | | | | | | | | | | | | | | No reason for every driver to emit code to set this, just make it part of the driver's existing static const ops structure. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | RDMA: Move driver_id into struct ib_device_opsJason Gunthorpe2019-06-102-2/+3
| | | | | | | | | | | | | | | | | | No reason for every driver to emit code to set this, just make it part of the driver's existing static const ops structure. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | RDMA/core: Make ib_destroy_cq() voidLeon Romanovsky2019-05-211-2/+2
| | | | | | | | | | | | | | | | Kernel destroy CQ flows can't fail and the returned value of ib_destroy_cq() is not interested in those flows. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | RDMA/umem: Move page_shift from ib_umem to ib_odp_umemJason Gunthorpe2019-05-212-15/+24
|/ | | | | | | | | | | This value has always been set to PAGE_SHIFT in the core code, the only thing that does differently was the ODP path. Move the value into the ODP struct and still use it for ODP, but change all the non-ODP things to just use PAGE_SHIFT/PAGE_SIZE/PAGE_MASK directly. Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
* RDMA: Add EFA related definitionsGal Pressman2019-05-061-1/+3
| | | | | | | | | | | | Add EFA driver ID to the IOCTL interface uapi. This patch also adds unspecified node/transport type that will be used by EFA (usnic is left unchanged as it's already part of our ABI). Signed-off-by: Gal Pressman <galpress@amazon.com> Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/umem: Remove hugetlb flagShiraz Saleem2019-05-061-1/+0
| | | | | | | | | The drivers i40iw and bnxt_re no longer dependent on the hugetlb flag. So remove this flag from ib_umem structure. Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/verbs: Add a DMA iterator to return aligned contiguous memory blocksShiraz Saleem2019-05-061-0/+47
| | | | | | | | | This helper iterates over a DMA-mapped SGL and returns contiguous memory blocks aligned to a HW supported page size. Suggested-by: Jason Gunthorpe <jgg@ziepe.ca> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/umem: Add API to find best driver supported page size in an MRShiraz Saleem2019-05-062-0/+33
| | | | | | | | | | This helper iterates through the SG list to find the best page size to use from a bitmap of HW supported page sizes. Drivers that support multiple page sizes, but not mixed sizes in an MR can use this API. Suggested-by: Jason Gunthorpe <jgg@ziepe.ca> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/core: Allow detaching gid attribute netdevice for RoCEParav Pandit2019-05-031-1/+1
| | | | | | | | | | | | | | | | When there is active traffic through a GID, a QP/AH holds reference to this GID entry. RoCE GID entry holds reference to its attached netdevice. Due to this when netdevice is deleted by admin user, its refcount is not dropped. Therefore, while deleting RoCE GID, wait for all GID attribute's netdev users to finish accessing netdev in rcu context. Once all users done accessing it, release the netdev refcount. Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/cma: Use rdma_read_gid_attr_ndev_rcu to access netdevParav Pandit2019-05-031-0/+1
| | | | | | | | | | | | To access the netdevice of the GID attribute, use an existing API rdma_read_gid_attr_ndev_rcu(). This further reduces dependency on open access to netdevice of GID attribute. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA: Introduce and use GID attr helper to read RoCE L2 fieldsParav Pandit2019-05-031-0/+3
| | | | | | | | | | | | | | Instead of RoCE drivers figuring out vlan, smac fields while working on QP/AH, provide a helper routine to read the L2 fields such as vlan_id and source mac address. This moves logic from mlx5 driver to core for wider usage for RoCE ports. This is a preparation patch to allow detaching netdev in subsequent patch. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA: Get rid of iw_cm_verbsKamal Heib2019-05-032-29/+19
| | | | | | | | | | | | Integrate iw_cm_verbs data members into ib_device_ops and ib_device structs, this is done to achieve the following: 1) Avoid memory related bugs durring error unwind 2) Make the code more cleaner 3) Reduce code duplication Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/core: Introduce RDMA subsystem ibdev_* print functionsGal Pressman2019-05-011-0/+30
| | | | | | | | | | | | | | | | | | | Similarly to dev/netdev/etc printk helpers, add standard printk helpers for the RDMA subsystem. Example output: efa 0000:00:06.0 efa_0: Hello World! efa_0: Hello World! (no parent device set) (NULL ib_device): Hello World! (ibdev is NULL) Cc: Jason Baron <jbaron@akamai.com> Suggested-by: Jason Gunthorpe <jgg@ziepe.ca> Suggested-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Gal Pressman <galpress@amazon.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* Merge branch 'rdma_mmap' into rdma.git for-nextJason Gunthorpe2019-04-241-9/+0
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jason Gunthorpe says: ==================== Upon review it turns out there are some long standing problems in BAR mapping area: * BAR pages intended for read-only can be switched to writable via mprotect. * Missing use of rdma_user_mmap_io for the mlx5 clock BAR page. * Disassociate causes SIGBUS when touching the pages. * CPU pages are being mapped through to the process via remap_pfn_range instead of the more appropriate vm_insert_page, causing weird behaviors during disassociation. This series adds the missing VM_* flag manipulation, adds faulting a zero page for disassociation and revises the CPU page mappings to use vm_insert_page. ==================== For dependencies this branch is based on for-rc from git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git * branch 'rdma_mmap': RDMA: Remove rdma_user_mmap_page RDMA/mlx5: Use get_zeroed_page() for clock_info RDMA/ucontext: Fix regression with disassociate RDMA/mlx5: Use rdma_user_map_io for mapping BAR pages RDMA/mlx5: Do not allow the user to write to the clock page Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * RDMA: Remove rdma_user_mmap_pageJason Gunthorpe2019-04-241-9/+0
| | | | | | | | | | | | | | | | | | | | | | Upon further research drivers that want this should simply call the core function vm_insert_page(). The VMA holds a reference on the page and it will be automatically freed when the last reference drops. No need for disassociate to sequence the cleanup. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | IB/{rdmavt, qib, hfi1}: Use new routine to release reference countsMike Marciniszyn2019-04-241-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | The reference count adjustments on reference count completion are open coded throughout. Add a routine to do all reference count adjustments and use. Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | IB/rdmavt: Fix ab/ba include issuesMike Marciniszyn2019-04-242-76/+76
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The currently include file ordering for rdmavt headers has an ab/ba include issue the precludes using inlines from rdma_vt.h in rdmavt_qp.h. At the heart of the issue is that rdma_vt.h includes rdmavt_qp.h. Fix the ordering issue by adjusting rdma_vt.h to not require rdmavt_qp.h and move qp related inlines to rdmavt_qp.h. Additionally, promote rvt_mmap_info to rdma_vt.h since it is shared by rdmavt_cq.h and rdmavt_qp.h. Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | IB/{rdmavt, hfi1): Miscellaneous comment fixesKaike Wan2019-04-241-1/+0
| | | | | | | | | | | | | | | | | | This patch fixes miscellaneous comment errors. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | RDMA: Handle SRQ allocations by IB/coreLeon Romanovsky2019-04-081-4/+5
| | | | | | | | | | | | | | Convert SRQ allocation from drivers to be in the IB/core Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | RDMA: Handle AH allocations by IB/coreLeon Romanovsky2019-04-081-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | Simplify drivers by ensuring lifetime of ib_ah object. The changes in .create_ah() go hand in hand with relevant update in .destroy_ah(). We will use this opportunity and convert .destroy_ah() to don't fail, as it was suggested a long time ago, because there is nothing to do in case of failure during destroy. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | RDMA/core: Support object allocation in atomic contextLeon Romanovsky2019-04-081-1/+6
| | | | | | | | | | | | | | | | AH objects are allocated in atomic context and those allocations should be done with GFP_ATOMIC. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | IB: When attrs.udata/ufile is available use that instead of uobjectJason Gunthorpe2019-04-081-1/+1
| | | | | | | | | | | | | | The ucontext and ufile should not be accessed via the uobject, all these cases have an attrs so use that instead. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | RDMA/umem: Combine contiguous PAGE_SIZE regions in SGEsShiraz Saleem2019-04-082-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Combine contiguous regions of PAGE_SIZE pages into single scatter list entry while building the scatter table for a umem. This minimizes the number of the entries in the scatter list and reduces the DMA mapping overhead, particularly with the IOMMU. Set default max_seg_size in core for IB devices to 2G and do not combine if we exceed this limit. Also, purge npages in struct ib_umem as we now DMA map the umem SGL with sg_nents and npage computation is not needed. Drivers should now be using ib_umem_num_pages(), so fix the last stragglers. Move npages tracking to ib_umem_odp as ODP drivers still need it. Suggested-by: Jason Gunthorpe <jgg@ziepe.ca> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Adit Ranadive <aditr@vmware.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Tested-by: Gal Pressman <galpress@amazon.com> Tested-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | IB: Pass only ib_udata in function prototypesShamir Rabinovitch2019-04-011-4/+1
| | | | | | | | | | | | | | | | | | | | | | Now when ib_udata is passed to all the driver's object create/destroy APIs the ib_udata will carry the ib_ucontext for every user command. There is no need to also pass the ib_ucontext via the functions prototypes. Make ib_udata the only argument psssed. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | IB: Pass uverbs_attr_bundle down ib_x destroy pathShamir Rabinovitch2019-04-011-34/+194
| | | | | | | | | | | | | | | | | | | | The uverbs_attr_bundle with the ucontext is sent down to the drivers ib_x destroy path as ib_udata. The next patch will use the ib_udata to free the drivers destroy path from the dependency in 'uobject->context' as we already did for the create path. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | IB: Pass uverbs_attr_bundle down uobject destroy pathShamir Rabinovitch2019-04-012-8/+14
| | | | | | | | | | | | | | | | | | Pass uverbs_attr_bundle down the uobject destroy path. The next patch will use this to eliminate the dependecy of the drivers in ib_x->uobject pointers. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | IB: ucontext should be set properly for all cmd & ioctl pathsShamir Rabinovitch2019-04-012-20/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | the Attempt to use the below commit to initialize the ucontext for the uobject destroy path has shown that the below commit is incomplete. Parts were reverted and the ucontext set up in the uverbs_attr_bundle was moved to rdma_lookup_get_uobject which is called from the uobj_get_XXX macros and rdma_alloc_begin_uobject which is called when uobject is created. Fixes: 3d9dfd060391 ("IB/uverbs: Add ib_ucontext to uverbs_attr_bundle sent from ioctl and cmd flows") Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | RDMA/core: Don't compare specific bit after boolean ANDLeon Romanovsky2019-03-291-2/+2
| | | | | | | | | | | | | | | | There is no need to perform extra comparison after boolean AND. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | RDMA: Check net namespace access for uverbs, umad, cma and nldevParav Pandit2019-03-281-0/+3
| | | | | | | | | | | | | | | | | | | | | | Introduce an API rdma_dev_access_netns() to check whether a rdma device can be accessed from the specified net namespace or not. Use rdma_dev_access_netns() while opening character uverbs, umad network device and also check while rdma cm_id binds to rdma device. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | RDMA/core: Implement compat device/sysfs tree in net namespaceParav Pandit2019-03-281-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement compatibility layer sysfs entries of ib_core so that non init_net net namespaces can also discover rdma devices. Each non init_net net namespace has ib_core_device created in it. Such ib_core_device sysfs tree resembles rdma devices found in init_net namespace. This allows discovering rdma devices in multiple non init_net net namespaces via sysfs entries and helpful to rdma-core userspace. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | RDMA/core: Introduce ib_core_device to hold deviceParav Pandit2019-03-281-6/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to support sysfs entries in multiple net namespaces for a rdma device, introduce a ib_core_device whose scope is limited to hold core device and per port sysfs related entries. This is preparation patch so that multiple ib_core_devices in each net namespace can be created in subsequent patch who all can share ib_device. (a) Move sysfs specific fields to ib_core_device. (b) Make sysfs and device life cycle related routines to work on ib_core_device. (c) Introduce and use rdma_init_coredev() helper to initialize coredev fields. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | RDMA: Use __packed annotation instead of __attribute__ ((packed))Erez Alfasi2019-03-254-6/+6
|/ | | | | | | | | | | | | | | "__attribute__" set of macros has been standardized, have became more potentially portable and consistent code back in v2.6.21 by commit 82ddcb040 ("[PATCH] extend the set of "__attribute__" shortcut macros"). Moreover, nowadays checkpatch.pl warns about using __attribute__((packed)) instead of __packed. This patch converts all the "__attribute__ ((packed))" annotations to "__packed" within the RDMA subsystem. Signed-off-by: Erez Alfasi <ereza@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA: Handle ucontext allocations by IB/coreLeon Romanovsky2019-02-221-3/+4
| | | | | | | Following the PD conversion patch, do the same for ucontext allocations. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/core: Add RDMA_NLDEV_CMD_NEWLINK/DELLINK supportSteve Wise2019-02-192-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | Add support for new LINK messages to allow adding and deleting rdma interfaces. This will be used initially for soft rdma drivers which instantiate device instances dynamically by the admin specifying a netdev device to use. The rdma_rxe module will be the first user of these messages. The design is modeled after RTNL_NEWLINK/DELLINK: rdma drivers register with the rdma core if they provide link add/delete functions. Each driver registers with a unique "type" string, that is used to dispatch messages coming from user space. A new RDMA_NLDEV_ATTR is defined for the "type" string. User mode will pass 3 attributes in a NEWLINK message: RDMA_NLDEV_ATTR_DEV_NAME for the desired rdma device name to be created, RDMA_NLDEV_ATTR_LINK_TYPE for the "type" of link being added, and RDMA_NLDEV_ATTR_NDEV_NAME for the net_device interface to use for this link. The DELLINK message will contain the RDMA_NLDEV_ATTR_DEV_INDEX of the device to delete. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/rxe: Close a race after ib_register_deviceJason Gunthorpe2019-02-191-0/+5
| | | | | | | | | | | | | Since rxe allows unregistration from other threads the rxe pointer can become invalid any moment after ib_register_driver returns. This could cause a user triggered use after free. Add another driver callback to be called right after the device becomes registered to complete any device setup required post-registration. This callback has enough core locking to prevent the device from becoming unregistered. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/device: Provide APIs from the core code to help unregistrationJason Gunthorpe2019-02-191-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | These APIs are intended to support drivers that exist outside the usual driver core probe()/remove() callbacks. Normally the driver core will prevent remove() from running concurrently with probe(), once this safety is lost drivers need more support to get the locking and lifetimes right. ib_unregister_driver() is intended to be used during module_exit of a driver using these APIs. It unregisters all the associated ib_devices. ib_unregister_device_and_put() is to be used by a driver-specific removal function (ie removal by name, removal from a netdev notifier, removal from netlink) ib_unregister_queued() is to be used from netdev notifier chains where RTNL is held. The locking is tricky here since once things become async it is possible to race unregister with registration. This is largely solved by relying on the registration refcount, unregistration will only ever work on something that has a positive registration refcount - and then an unregistration mutex serializes all competing unregistrations of the same device. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>