summaryrefslogtreecommitdiffstats
path: root/include/rdma
Commit message (Collapse)AuthorAgeFilesLines
* IB/core: uverbs copy to struct or zero helperMichael Guralnik2018-12-201-0/+8
| | | | | | | | | | | | | | | | | | | | Add a helper to zero fill fields before copying data to UVERBS_ATTR_STRUCT. As UVERBS_ATTR_STRUCT can be used as an extensible struct, we want to make sure that if the user supplies us with a struct that has new fields that we are not aware of, we return them zeroed to the user. This helper should be used when using UVERBS_ATTR_STRUCT for an extendable data structure and there is a need to make sure that extended members of the struct, that the kernel doesn't handle, are returned zeroed to the user. This is needed due to the fact that UVERBS_ATTR_STRUCT allows non-zero values for members after 'last' member. Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA: Mark if destroy address handle is in a sleepable contextGal Pressman2018-12-191-2/+8
| | | | | | | | | | | | Introduce a 'flags' field to destroy address handle callback and add a flag that marks whether the callback is executed in an atomic context or not. This will allow drivers to wait for completion instead of polling for it when it is allowed. Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA: Mark if create address handle is in a sleepable contextGal Pressman2018-12-191-2/+9
| | | | | | | | | | | Introduce a 'flags' field to create address handle callback and add a flag that marks whether the callback is executed in an atomic context or not. This will allow drivers to wait for completion instead of polling for it when it is allowed. Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/restrack: Resource-tracker should not use uobject pointersShamir Rabinovitch2018-12-181-6/+7
| | | | | | | | | Having uobject pointer embedded in ib core objects is not aligned with a future shared ib_x model. The resource tracker only does this to keep track of user/kernel objects - track this directly instead. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/uverbs: Add support to advise_mrMoni Shoua2018-12-181-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Add new ioctl method for the MR object - ADVISE_MR. This command can be used by users to give an advice or directions to the kernel about an address range that belongs to memory regions. A new ib_device callback, advise_mr(), is introduced here to suupport the new command. This command takes the following arguments: - pd: The protection domain to which all memory regions belong - advice: The type of the advice * IB_UVERBS_ADVISE_MR_ADVICE_PREFETCH - Pre-fetch a range of an on-demand paging MR * IB_UVERBS_ADVISE_MR_ADVICE_PREFETCH_WRITE - Pre-fetch a range of an on-demand paging MR with write intention - flags: The properties of the advice * IB_UVERBS_ADVISE_MR_FLAG_FLUSH - Operation must end before return to the caller - sg_list: The list of memory ranges - num_sge: The number of memory ranges in the list - attrs: More attributes to be parsed by the provider Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/uverbs: Add helper to get array size from ptr attributeMoni Shoua2018-12-181-0/+22
| | | | | | | | | | | | When the parser of an ioctl command has the knowledge that a ptr attribute in a bundle represents an array of structures, it is useful for it to know the number of elements in the array. This is done by dividing the attribute length with the element size. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA: Start use ib_device_opsKamal Heib2018-12-122-296/+19
| | | | | | | Make all the required change to start use the ib_device_ops structure. Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/core: Introduce ib_device_opsKamal Heib2018-12-111-0/+242
| | | | | | | | | | | | | | This change introduces the ib_device_ops structure that defines all the InfiniBand device operations in one place, so the code will be more readable and clean, unlike today when the ops are mixed with ib_device data members. The providers will need to define the supported operations and assign them using ib_set_device_ops(), that will also make the providers code more readable and clean. Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/mlx5: Fail early if user tries to create flows on IB representorsLeon Romanovsky2018-12-111-4/+5
| | | | | | | | | | IB representors don't support creation of RAW ethernet QP flows. Disable them by reusing existing RDMA/core support macros. We do it for both creation and matcher because latter is not usable if no flow creation is available. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/core: Add new IB ratesMichael Guralnik2018-12-111-1/+5
| | | | | | | | | | Add the new rates that were added to Infiniband spec as part of HDR and 2x support. Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/core: Add 2X port widthMichael Guralnik2018-12-111-0/+2
| | | | | | | | | Add the new 2X port width that is part of IB spec 1.3 Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/core: Add CapabilityMask2 to port attributesMichael Guralnik2018-12-112-0/+11
| | | | | | | | | | | CapabilityMask2 was added in IB Spec 1.3 under PortInfo attribute. The new Capapbility mask is needed in order to expose the new 2X width and HDR speed. Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/uverbs: Fix typo in string concatenation macroLeon Romanovsky2018-12-061-1/+1
| | | | | | | | Update UVERBS_OBJECT() macro to properly concatenate the object name. Fixes: e502a864c352 ("IB/core: Introduce DECLARE_UVERBS_GLOBAL_METHODS") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/hfi1: Allow the driver to initialize QP priv structMike Marciniszyn2018-12-061-0/+7
| | | | | | | | | | | | | This patch adds an interface to allow the driver to initialize the QP priv struct when the QP is created and after the qpn has been assigned. A field is added to the QP priv struct to reference the rcd and two new files are added to contain the function to initialize the rcd field so that more TID RDMA related code can be added here later. Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/core: Enable getting an object type from a given uobjectYishai Hadas2018-12-041-0/+12
| | | | | | | | | Enable getting an object type from a given uobject, the type is saved upon tree merging and is returned as part of some helper function. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/core: Introduce UVERBS_IDR_ANY_OBJECTYishai Hadas2018-12-041-0/+6
| | | | | | | | | | | | | | Introduce the UVERBS_IDR_ANY_OBJECT type to match any IDR object. Once used, the infrastructure skips checking for the IDR type, it becomes the driver handler responsibility. This enables drivers to get in a given method an object from various of types. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* RDMA/restrack: Track ucontextLeon Romanovsky2018-12-032-0/+8
| | | | | | | | Add ability to track allocated ib_ucontext, which are limited resource and worth to be visible by users. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* RDMA/uverbs: Use only attrs for the write() handler signatureJason Gunthorpe2018-12-031-6/+2
| | | | | | | | | | | All of the old arguments can be derived from the uverbs_attr_bundle structure, so get rid of the redundant arguments. Most of the prior work has been removing users of the arguments to allow this to be a simple patch. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* RDMA/uverbs: Use uverbs_attr_bundle to pass ucore for write/write_exJason Gunthorpe2018-12-031-0/+1
| | | | | | | | | | | | This creates a consistent way to access the two core buffers across write and write_ex handlers. Remove the open coded ucore conversion in the write/ex compatibility handlers. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* RDMA/uverbs: Use uverbs_attr_bundle to pass udata for ioctl()Jason Gunthorpe2018-11-261-2/+6
| | | | | | | | | | | Have the core code initialize the driver_udata if the method has a udata description. This is done using the same create_udata the handler was supposed to call. This makes ioctl consistent with the write and write_ex paths. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
* RDMA/uverbs: Use uverbs_attr_bundle to pass udata for write_exJason Gunthorpe2018-11-261-2/+2
| | | | | | | | | | | | | The core code needs to compute the udata so we may as well pass it in the uverbs_attr_bundle instead of on the stack. This converts the simple case of write_ex() which already has a core calculation. Also change the write() path to use the attrs for ib_uverbs_init_udata() instead of on the stack. This lets the write to write_ex compatibility path continue to follow the lead of the _ex path. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
* RDMA/uverbs: Add structure size info to write commandsJason Gunthorpe2018-11-261-3/+9
| | | | | | | | | | | We need the structure sizes to compute the location of the udata in the core code. Annotate the sizes into the new macro language. This is generated largely by script and checked by comparing against the similar list in rdma-core. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
* RDMA/uverbs: Do not pass ib_uverbs_file to ioctl methodsJason Gunthorpe2018-11-262-4/+3
| | | | | | | | The uverbs_attr_bundle already contains this pointer, and most methods don't actually need it. Get rid of the redundant function argument. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
* RDMA/uverbs: Make write() handlers return 0 on successJason Gunthorpe2018-11-262-10/+8
| | | | | | | | Currently they return the command length, while all other handlers return 0. This makes the write path closer to the write_ex and ioctl path. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
* RDMA/uverbs: Replace ib_uverbs_file with uverbs_attr_bundle for writeJason Gunthorpe2018-11-263-26/+34
| | | | | | | | | | | | | | | | Now that we can add meta-data to the description of write() methods we need to pass the uverbs_attr_bundle into all write based handlers so future patches can use it as a container for any new data transferred out of the core. This is the first step to bringing the write() and ioctl() methods to a common interface signature. This is a simple search/replace, and we push the attr down into the uobj and other APIs to keep changes minimal. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
* RDMA/core: Sync unregistration with netlink commandsParav Pandit2018-11-221-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the rdma device is getting removed, get resource info can race with device removal, as below: CPU-0 CPU-1 -------- -------- rdma_nl_rcv_msg() nldev_res_get_cq_dumpit() mutex_lock(device_lock); get device reference mutex_unlock(device_lock); [..] ib_unregister_device() /* Valid reference to * device->dev exists. */ ib_dealloc_device() [..] provider->fill_res_entry(); Even though device object is not freed, fill_res_entry() can get called on device which doesn't have a driver anymore. Kernel core device reference count is not sufficient, as this only keeps the structure valid, and doesn't guarantee the driver is still loaded. Similar race can occur with device renaming and device removal, where device_rename() tries to rename a unregistered device. While this is fine for devices of a class which are not net namespace aware, but it is incorrect for net namespace aware class coming in subsequent series. If a class is net namespace aware, then the below [1] call trace is observed in above situation. Therefore, to avoid the race, keep a reference count and let device unregistration wait until all netlink users drop the reference. [1] Call trace: kernfs: ns required in 'infiniband' for 'mlx5_0' WARNING: CPU: 18 PID: 44270 at fs/kernfs/dir.c:842 kernfs_find_ns+0x104/0x120 libahci i2c_core mlxfw libata dca [last unloaded: devlink] RIP: 0010:kernfs_find_ns+0x104/0x120 Call Trace: kernfs_find_and_get_ns+0x2e/0x50 sysfs_rename_link_ns+0x40/0xb0 device_rename+0xb2/0xf0 ib_device_rename+0xb3/0x100 [ib_core] nldev_set_doit+0x165/0x190 [ib_core] rdma_nl_rcv_msg+0x249/0x250 [ib_core] ? netlink_deliver_tap+0x8f/0x3e0 rdma_nl_rcv+0xd6/0x120 [ib_core] netlink_unicast+0x17c/0x230 netlink_sendmsg+0x2f0/0x3e0 sock_sendmsg+0x30/0x40 __sys_sendto+0xdc/0x160 Fixes: da5c85078215 ("RDMA/nldev: add driver-specific resource tracking") Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/uverbs: Check for NULL driver methods for every write callJason Gunthorpe2018-11-221-0/+16
| | | | | | | | | | | | | | Add annotations to the uverbs_api structure indicating which driver methods are called by the implementation. If the required method is NULL the write API will be not be callable. This effectively duplicates the cmd_mask system, however it does it by expressing invariants required by the core code, not by delegating decision making to the driver. This is another step toward eliminating cmd_mask. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
* RDMA/verbs: Store the write/write_ex uapi entry points in the uverbs_apiJason Gunthorpe2018-11-221-6/+88
| | | | | | | | | | | Bringing all uapi entry points into one place lets us deal with them consistently. For instance the write, write_ex and ioctl paths can be disabled when an API is not supported by the driver. This will replace the uverbs_cmd_table static arrays. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
* RDMA/uverbs: Add helpers to mark uapi functions as unsupportedJason Gunthorpe2018-11-221-0/+31
| | | | | | | | | | | | | | | | | We have many cases where parts of the uapi are not supported in a driver, needs a certain protocol, or whatever. It is best to reflect this directly into the struct uverbs_api when it is built so that everything is simply blocked off, and future introspection can report a proper supported list. This is done by adding some additional helpers to the definition list language that disable objects based on a 'supported' call back, and a helper that disables based on a NULL struct ib_device function pointer. Disablement is global. For instance, if a driver disables an object then everything connected to that object is removed, including core methods. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
* RDMA/uverbs: Use a linear list to describe the compiled-in uapiJason Gunthorpe2018-11-224-37/+38
| | | | | | | | | | | | | | | | | | The 'tree' data structure is very hard to build at compile time, and this makes it very limited. The new radix tree based compiler can handle a more complex input language that does not require the compiler to perfectly group everything into a neat tree structure. Instead use a simple list to describe to input, where the list elements can be of various different 'opcodes' instructing the radix compiler what to do. Start out with opcodes chaining to other definition lists and chaining to the existing 'tree' definition. Replace the very top level of the 'object tree' with this list type and get rid of struct uverbs_object_tree_def and DECLARE_UVERBS_OBJECT_TREE. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
* IB/core: Make function ib_fmr_pool_unmap return voidYuval Shaia2018-11-211-1/+1
| | | | | | | | | | Since the function always returns 0 make it void. Reported-by: HÃ¥kon Bugge <haakon.bugge@oracle.com> Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/core: Remove unused header files mm.h, socket.h, scatterlist.hParav Pandit2018-11-211-3/+0
| | | | | | | | | Structures of ib_verbs.h don't use fields/structures of mm.h, socket.h or scatterlist.h. So remove such header files inclusion. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/uverbs: fix a typoRami Rosen2018-11-061-1/+1
| | | | | | | | This patch fixes a typo in include/rdma/ib_verbs.h. See: https://www.merriam-webster.com/dictionary/lieu Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds2018-10-2613-146/+385
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull rdma updates from Jason Gunthorpe: "This has been a smaller cycle with many of the commits being smallish code fixes and improvements across the drivers. - Driver updates for bnxt_re, cxgb4, hfi1, hns, mlx5, nes, qedr, and rxe - Memory window support in hns - mlx5 user API 'flow mutate/steering' allows accessing the full packet mangling and matching machinery from user space - Support inter-working with verbs API calls in the 'devx' mlx5 user API, and provide options to use devx with less privilege - Modernize the use of syfs and the device interface to use attribute groups and cdev properly for uverbs, and clean up some of the core code's device list management - More progress on net namespaces for RDMA devices - Consolidate driver BAR mmapping support into core code helpers and rework how RDMA holds poitners to mm_struct for get_user_pages cases - First pass to use 'dev_name' instead of ib_device->name - Device renaming for RDMA devices" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (242 commits) IB/mlx5: Add support for extended atomic operations RDMA/core: Fix comment for hw stats init for port == 0 RDMA/core: Refactor ib_register_device() function RDMA/core: Fix unwinding flow in case of error to register device ib_srp: Remove WARN_ON in srp_terminate_io() IB/mlx5: Allow scatter to CQE without global signaled WRs IB/mlx5: Verify that driver supports user flags IB/mlx5: Support scatter to CQE for DC transport type RDMA/drivers: Use core provided API for registering device attributes RDMA/core: Allow existing drivers to set one sysfs group per device IB/rxe: Remove unnecessary enum values RDMA/umad: Use kernel API to allocate umad indexes RDMA/uverbs: Use kernel API to allocate uverbs indexes RDMA/core: Increase total number of RDMA ports across all devices IB/mlx4: Add port and TID to MAD debug print IB/mlx4: Enable debug print of SMPs RDMA/core: Rename ports_parent to ports_kobj RDMA/core: Do not expose unsupported counters IB/mlx4: Refer to the device kobject instead of ports_parent RDMA/nldev: Allow IB device rename through RDMA netlink ...
| * RDMA/core: Allow existing drivers to set one sysfs group per deviceParav Pandit2018-10-171-2/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently many rdma drivers are creating device attribute files using device_create_file() with device specific attributes. Device specific attributes should be exposed via well defined netlink device attributes in future. Introduce an API rdma_set_device_sysfs_group() for existing drivers to set a group for sysfs attributes for legacy. This API is only for exposing legacy attributes which existed for sometime now. New drivers should not be using this API and rather follow netlink path. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * RDMA/core: Rename ports_parent to ports_kobjParav Pandit2018-10-161-1/+1
| | | | | | | | | | | | | | | | | | Normally kobj objects have kobj suffix to reflect it. Rename ports_parent to ports_kobj. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * RDMA/core: Annotate timeout as unsigned longLeon Romanovsky2018-10-164-7/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ucma users supply timeout in u32 format, it means that any number with most significant bit set will be converted to negative value by various rdma_*, cma_* and sa_query functions, which treat timeout as int. In the lowest level, the timeout is converted back to be unsigned long. Remove this ambiguous conversion by updating all function signatures to receive unsigned long. Reported-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * RDMA/core: Align multiple functions to kernel coding styleLeon Romanovsky2018-10-162-23/+16
| | | | | | | | | | | | | | | | | | This patch changes the small number of functions to be aligned to kernel coding style. It is needed to minimize the diffstat of the following patch. It doesn't change any functionality. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * RDMA/restrack: Consolidate task name updates in one placeLeon Romanovsky2018-10-051-2/+2
| | | | | | | | | | | | | | | | | | | | Unify task update and kernel name set in one place. Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Reviewed-by: Yossi Itigin <yosefe@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * RDMA/restrack: Un-inline set task implementationLeon Romanovsky2018-10-051-8/+2
| | | | | | | | | | | | | | | | | | | | | | Prepare rdma_restrack_set_task() call to accommodate more code by moving its implementation from *.h to *.c. Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Reviewed-by: Yossi Itigin <yosefe@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * IB/{hfi1, qib, rdmavt}: Move ruc_loopback to rdmavtVenkata Sandeep Dhanalakota2018-10-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch moves ruc_loopback() from hfi1 into rdmavt for code sharing with the qib driver. Reviewed-by: Brian Welty <brian.welty@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * IB/{hfi1, qib, rdmavt}: Move send completion logic to rdmavtVenkata Sandeep Dhanalakota2018-10-032-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | Moving send completion code into rdmavt in order to have shared logic between qib and hfi1 drivers. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Brian Welty <brian.welty@intel.com> Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * IB/{hfi1, qib, rdmavt}: Move copy SGE logic into rdmavtBrian Welty2018-10-032-0/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch moves hfi1_copy_sge() into rdmavt for sharing with qib. This patch also moves all the wss_*() functions into rdmavt as several wss_*() functions are called from hfi1_copy_sge() When SGE copy mode is adaptive, cacheless copy may be done in some cases for performance reasons. In those cases, X86 cacheless copy function is called since the drivers that use rdmavt and may set SGE copy mode to adaptive are X86 only. For this reason, this patch adds "depends on X86_64" to rdmavt/Kconfig. Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Brian Welty <brian.welty@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * IB/mlx4: Avoid implicit enumerated type conversionNathan Chancellor2018-10-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Clang warns when one enumerated type is implicitly converted to another. drivers/infiniband/hw/mlx4/mad.c:1811:41: warning: implicit conversion from enumeration type 'enum mlx4_ib_qp_flags' to different enumeration type 'enum ib_qp_create_flags' [-Wenum-conversion] qp_init_attr.init_attr.create_flags = MLX4_IB_SRIOV_TUNNEL_QP; ~ ^~~~~~~~~~~~~~~~~~~~~~~ drivers/infiniband/hw/mlx4/mad.c:1819:41: warning: implicit conversion from enumeration type 'enum mlx4_ib_qp_flags' to different enumeration type 'enum ib_qp_create_flags' [-Wenum-conversion] qp_init_attr.init_attr.create_flags = MLX4_IB_SRIOV_SQP; ~ ^~~~~~~~~~~~~~~~~ The type mlx4_ib_qp_flags explicitly provides supplemental values to the type ib_qp_create_flags. Make that clear to Clang by changing the create_flags type to u32. Reported-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * RDMA/netlink: Simplify netlink listener existence checkLeon Romanovsky2018-10-031-2/+2
| | | | | | | | | | | | | | | | | | All users of rdma_nl_chk_listeners() are interested to get boolean answer if netlink socket has listeners, so update all places to boolean function. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * RDMA: Remove unused parameter from ib_modify_qp_is_ok()Kamal Heib2018-10-031-3/+1
| | | | | | | | | | | | | | | | The ll parameter is not used in ib_modify_qp_is_ok(), so remove it. Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * IB/hfi1: Prepare resource waits for dual legDennis Dalessandro2018-09-301-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current implementation allows each qp to have only one send engine. As such, each qp has only one list to queue prebuilt packets when send engine resources are not available. To improve performance, it is desired to support multiple send engines for each qp. This patch creates the framework to support two send engines (two legs) for each qp for the TID RDMA protocol, which can be easily extended to support more send engines. It achieves the goal by creating a leg specific struct, iowait_work in the iowait struct, to hold the work_struct and the tx_list as well as a pointer to the parent iowait struct. The hfi1_pkt_state now has an additional field to record the current legs work structure and that is now passed to all egress waiters to determine the leg that needs to wait via a new iowait helper. The APIs are adjusted to use the new leg specific struct as required. Many new and modified helpers are added to support this change. Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * IB/rdmavt: Rename check_send_wqe as setup_wqeKaike Wan2018-09-301-6/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | The driver-provided function check_send_wqe allows the hardware driver to check and set up the incoming send wqe before it is inserted into the swqe ring. This patch will rename it as setup_wqe to better reflect its usage. In addition, this function is only called when all setup is complete in rdmavt. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * RDMA/ulp: Use dev_name instead of ibdev->nameJason Gunthorpe2018-09-261-1/+1
| | | | | | | | | | | | | | | | | | | | These return the same thing but dev_name is a more conventional use of the kernel API. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
| * RDMA: Fully setup the device name in ib_register_deviceJason Gunthorpe2018-09-262-4/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current code has two copies of the device name, ibdev->dev and dev_name(&ibdev->dev), and they are setup at different times, which is very confusing. Set them both up at the same time and make dev_name() the lead name, which is the proper use of the driver core APIs. To make it very clear that the name is not valid until registration pass it in to the ib_register_device() call rather than messing with ibdev->name directly. Also the reorganization now checks that dev_name is unique even if it does not contain a %. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Acked-by: Adit Ranadive <aditr@vmware.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Acked-by: Devesh Sharma <devesh.sharma@broadcom.com> Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>