summaryrefslogtreecommitdiffstats
path: root/include/linux/mlx5
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds2021-05-012-2/+42
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull rdma updates from Jason Gunthorpe: "This is significantly bug fixes and general cleanups. The noteworthy new features are fairly small: - XRC support for HNS and improves RQ operations - Bug fixes and updates for hns, mlx5, bnxt_re, hfi1, i40iw, rxe, siw and qib - Quite a few general cleanups on spelling, error handling, static checker detections, etc - Increase the number of device ports supported beyond 255. High port count software switches now exist - Several bug fixes for rtrs - mlx5 Device Memory support for host controlled atomics - Report SRQ tables through to rdma-tool" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (145 commits) IB/qib: Remove redundant assignment to ret RDMA/nldev: Add copy-on-fork attribute to get sys command RDMA/bnxt_re: Fix a double free in bnxt_qplib_alloc_res RDMA/siw: Fix a use after free in siw_alloc_mr IB/hfi1: Remove redundant variable rcd RDMA/nldev: Add QP numbers to SRQ information RDMA/nldev: Return SRQ information RDMA/restrack: Add support to get resource tracking for SRQ RDMA/nldev: Return context information RDMA/core: Add CM to restrack after successful attachment to a device RDMA/cma: Skip device which doesn't support CM RDMA/rxe: Fix a bug in rxe_fill_ip_info() RDMA/mlx5: Expose private query port RDMA/mlx4: Remove an unused variable RDMA/mlx5: Fix type assignment for ICM DM IB/mlx5: Set right RoCE l3 type and roce version while deleting GID RDMA/i40iw: Fix error unwinding when i40iw_hmc_sd_one fails RDMA/cxgb4: add missing qpid increment IB/ipoib: Remove unnecessary struct declaration RDMA/bnxt_re: Get rid of custom module reference counting ...
| * Merge branch 'mlx5_memic_ops' of ↵Jason Gunthorpe2021-04-131-1/+41
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Maor Gottlieb says: ==================== This series from Maor extends MEMIC to support atomic operations from the host in addition to already supported regular read/write. ==================== * 'memic_ops': RDMA/mlx5: Expose UAPI to query DM RDMA/mlx5: Add support in MEMIC operations RDMA/mlx5: Add support to MODIFY_MEMIC command RDMA/mlx5: Re-organize the DM code RDMA/mlx5: Move all DM logic to separate file RDMA/uverbs: Make UVERBS_OBJECT_METHODS to consider line number net/mlx5: Add MEMIC operations related bits
| | * net/mlx5: Add MEMIC operations related bitsMaor Gottlieb2021-04-131-1/+41
| | | | | | | | | | | | | | | | | | | | | Add the MEMIC operations bits and structures to the mlx5_ifc file. Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
| * | Merge branch 'mlx5-next' of ↵Jason Gunthorpe2021-04-123-10/+33
| |\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Saeed Mahameed says: ==================== This pr contains changes from mlx5-next branch, already reviewed on netdev and rdma mailing lists, links below. 1) From Leon, Dynamically assign MSI-X vectors count Already Acked by Bjorn Helgaas. https://patchwork.kernel.org/project/netdevbpf/cover/20210314124256.70253-1-leon@kernel.org/ 2) Cleanup series: https://patchwork.kernel.org/project/netdevbpf/cover/20210311070915.321814-1-saeed@kernel.org/ From Mark, E-Switch cleanups and refactoring, and the addition of single FDB mode needed HW bits. From Mikhael, Remove unused struct field From Saeed, Cleanup W=1 prototype warning From Zheng, Esw related cleanup From Tariq, User order-0 page allocation for EQs ==================== * mlx5-next: net/mlx5: Implement sriov_get_vf_total_msix/count() callbacks net/mlx5: Dynamically assign MSI-X vectors count net/mlx5: Add dynamic MSI-X capabilities bits PCI/IOV: Add sysfs MSI-X vector assignment interface net/mlx5: Use order-0 allocations for EQs net/mlx5: Add IFC bits needed for single FDB mode net/mlx5: E-Switch, Refactor send to vport to be more generic RDMA/mlx5: Use representor E-Switch when getting netdev and metadata net/mlx5: E-Switch, Add eswitch pointer to each representor net/mlx5: E-Switch, Add match on vhca id to default send rules net/mlx5: Remove unused mlx5_core_health member recover_work net/mlx5: simplify the return expression of mlx5_esw_offloads_pair() net/mlx5: Cleanup prototype warning Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | RDMA/mlx5: Fix query RoCE portMaor Gottlieb2021-03-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mlx5_is_roce_enabled returns the devlink RoCE init value, therefore it should be used only when driver is loaded. Instead we just need to read the roce_en field. In addition, rename mlx5_is_roce_enabled to mlx5_is_roce_init_enabled. Fixes: 7a58779edd75 ("IB/mlx5: Improve query port for representor port") Link: https://lore.kernel.org/r/20210304124517.1100608-2-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* | | net/mlx5: E-Switch, Prepare to return total vports from eswitch structParav Pandit2021-04-241-8/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Total vports are already stored during eswitch initialization. Instead of calculating everytime, read directly from eswitch. Additionally, host PF's SF vport information is available using QUERY_HCA_CAP command. It is not available through HCA_CAP of the eswitch manager PF. Hence, this patch prepares the return total eswitch vport count from the existing eswitch struct. This further helps to keep eswitch port counting macros and logic within eswitch. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* | | net/mlx5: E-Switch, Return eswitch max ports when eswitch is supportedParav Pandit2021-04-241-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mlx5_eswitch_get_total_vports() doesn't honor MLX5_ESWICH Kconfig flag. When MLX5_ESWITCH is disabled, FS layer continues to initialize eswitch specific ACL namespaces. Instead, start honoring MLX5_ESWITCH flag and perform vport specific initialization only when vport count is non zero. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Vu Pham <vuhuong@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* | | net/mlx5: DR, Add support for isolate_vl_tc QPYevgeny Kliteynik2021-04-191-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When using SW steering, rule insertion rate depends on the RDMA RC QP performance used for writing to the ICM. During stress this QP is competing on the HW resources with all the other QPs that are used to send data. To protect SW steering QP's performance in such cases, we set this QP to use isolated VL. The VL number is reserved by FW and is not exposed to the driver. Support for this QP on isolated VL exists only when both force-loopback and isolate_vl_tc capabilities are set. Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* | | net/mlx5: DR, Add support for force-loopback QPYevgeny Kliteynik2021-04-191-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When supported by the device, SW steering RoCE RC QP that is used to write/read to/from ICM will be created with force-loopback attribute. Such QP doesn't require GID index upon creation. Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* | | net/mlx5: mlx5_ifc updates for flex parserYevgeny Kliteynik2021-04-191-4/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | Added the required definitions for supporting more protocols by flex parsers (GTP-U, Geneve TLV options), and for using the right flex parser that was configured for this protocol. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* | | net/mlx5e: RX, Add checks for calculated Striding RQ attributesTariq Toukan2021-04-191-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Striding RQ attributes below are mutually dependent. An unaware change to one might take the others out of the valid range derived by the HW caps: - The MPWQE size in bytes - The number of strides in a MPWQE - The stride size Add checks to verify they are valid and comply to the HW spec and SW assumptions/requirements. This is not a fix, no particular issue exists today. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* | | net/mlx5: Add register layout to support extended link stateMoshe Tal2021-04-162-0/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add needed structure layouts and defines for pddr register (Port Diagnostics Database Register) and the troublshooting page. This will be used to get extended link state from the monitor opcode bits. Signed-off-by: Moshe Tal <moshet@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* | | net/mlx5: E-Switch, Make vport number u16Parav Pandit2021-04-141-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Vport number is 16-bit field in hardware. Make it u16. Move location of vport in the structure so that it reduces a hole in the structure. Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* | | net/mlx5: Add support for DSFP module EEPROM dumpsVladyslav Tarasiuk2021-04-111-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | Allow the driver to recognise DSFP transceiver module ID and therefore allow its EEPROM dumps using ethtool. Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | net/mlx5: Implement get_module_eeprom_by_page()Vladyslav Tarasiuk2021-04-111-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | Implement ethtool_ops::get_module_eeprom_by_page() to enable support of new SFP standards. Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | net/mlx5: Refactor module EEPROM queryVladyslav Tarasiuk2021-04-111-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | Prepare for ethtool_ops::get_module_eeprom_data() implementation by extracting common part of mlx5_query_module_eeprom() into a separate function. Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2021-04-091-4/+6
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: MAINTAINERS - keep Chandrasekar drivers/net/ethernet/mellanox/mlx5/core/en_main.c - simple fix + trust the code re-added to param.c in -next is fine include/linux/bpf.h - trivial include/linux/ethtool.h - trivial, fix kdoc while at it include/linux/skmsg.h - move to relevant place in tcp.c, comment re-wrapped net/core/skmsg.c - add the sk = sk // sk = NULL around calls net/tipc/crypto.c - trivial Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | | net/mlx5: Fix PBMC register mappingAya Levin2021-04-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add reserved mapping to cover all the register in order to avoid setting arbitrary values to newer FW which implements the reserved fields. Fixes: 50b4a3c23646 ("net/mlx5: PPTB and PBMC register firmware command support") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
| * | | net/mlx5: Fix PPLM register mappingAya Levin2021-04-061-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add reserved mapping to cover all the register in order to avoid setting arbitrary values to newer FW which implements the reserved fields. Fixes: a58837f52d43 ("net/mlx5e: Expose FEC feilds and related capability bit") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
| * | | net/mlx5: Fix placement of log_max_flow_counterRaed Salem2021-04-061-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The cited commit wrongly placed log_max_flow_counter field of mlx5_ifc_flow_table_prop_layout_bits, align it to the HW spec intended placement. Fixes: 16f1c5bb3ed7 ("net/mlx5: Check device capability for maximum flow counters") Signed-off-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* | | | Merge branch 'mlx5-next' of ↵Jakub Kicinski2021-04-093-10/+33
|\ \ \ \ | | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Saeed Mahameed says: ==================== mlx5-next 2021-04-09 This pr contains changes from mlx5-next branch, already reviewed on netdev and rdma mailing lists, links below. 1) From Leon, Dynamically assign MSI-X vectors count Already Acked by Bjorn Helgaas. https://patchwork.kernel.org/project/netdevbpf/cover/20210314124256.70253-1-leon@kernel.org/ 2) Cleanup series: https://patchwork.kernel.org/project/netdevbpf/cover/20210311070915.321814-1-saeed@kernel.org/ From Mark, E-Switch cleanups and refactoring, and the addition of single FDB mode needed HW bits. From Mikhael, Remove unused struct field From Saeed, Cleanup W=1 prototype warning From Zheng, Esw related cleanup From Tariq, User order-0 page allocation for EQs * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5: Implement sriov_get_vf_total_msix/count() callbacks net/mlx5: Dynamically assign MSI-X vectors count net/mlx5: Add dynamic MSI-X capabilities bits PCI/IOV: Add sysfs MSI-X vector assignment interface net/mlx5: Use order-0 allocations for EQs net/mlx5: Add IFC bits needed for single FDB mode net/mlx5: E-Switch, Refactor send to vport to be more generic RDMA/mlx5: Use representor E-Switch when getting netdev and metadata net/mlx5: E-Switch, Add eswitch pointer to each representor net/mlx5: E-Switch, Add match on vhca id to default send rules net/mlx5: Remove unused mlx5_core_health member recover_work net/mlx5: simplify the return expression of mlx5_esw_offloads_pair() net/mlx5: Cleanup prototype warning ==================== Link: https://lore.kernel.org/r/20210409200704.10886-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | | net/mlx5: Add dynamic MSI-X capabilities bitsLeon Romanovsky2021-04-041-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These new fields declare the number of MSI-X vectors that is possible to allocate on the VF through PF configuration. Value must be in range defined by min_dynamic_vf_msix_table_size and max_dynamic_vf_msix_table_size. The driver should continue to query its MSI-X table through PCI configuration header. Link: https://lore.kernel.org/linux-pci/20210314124256.70253-3-leon@kernel.org Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
| * | | net/mlx5: Use order-0 allocations for EQsTariq Toukan2021-03-121-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we are allocating high-order page for EQs. In case of fragmented system, VF hot remove/add in VMs for example, there isn't enough contiguous memory for EQs allocation, which results in crashing of the VM. Therefore, use order-0 fragments for the EQ allocations instead. Performance tests: ConnectX-5 100Gbps, CPU: Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz Performance tests show no sensible degradation. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
| * | | net/mlx5: Add IFC bits needed for single FDB modeMark Bloch2021-03-121-6/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we operate in a mode where each eswitch manager has a separate FDB. In order to combine these multiple FDBs we expose new caps to allow this: - Set root flow table which isn't native. - Set FDB a different selection mode when in LAG mode. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
| * | | net/mlx5: E-Switch, Refactor send to vport to be more genericMark Bloch2021-03-121-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that each representor stores a pointer to the managing E-Switch use that information when creating the send-to-vport rules. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
| * | | net/mlx5: E-Switch, Add eswitch pointer to each representorMark Bloch2021-03-121-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Store the managing E-Switch of each representor. This will be used when a representor is created on eswitch manager 0 but the vport belongs to eswitch manager 1. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
| * | | net/mlx5: Remove unused mlx5_core_health member recover_workMikhael Goikhman2021-03-121-1/+0
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | The code related to health->recover_work was removed in commit 63cbc552eebf ("net/mlx5: Handle SW reset of FW in error flow") Fix struct mlx5_core_health accordingly. Signed-off-by: Mikhael Goikhman <migo@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* | | net/mlx5: Map register values to restore objectsChris Mi2021-04-061-5/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently reg_c0 lower 16 bits and reg_b are used to store the chain id that missed in FDB and NIC tables accordingly. However, the registers' values may index a restore object, rather than a single u32 value. Different object types can be used to restore mutually exclusive contexts such as chain id and sample group id. Use the mapping object to associate an index with a restore object as a prestep for supporting additional restore types. Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Oz Shlomo <ozsh@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* | | net/mlx5: Allocate rate limit table when rate is configuredParav Pandit2021-04-021-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A device supports 128 rate limiters. A static table allocation consumes 8KB of memory even when rate is not configured. Instead, allocate the table when at least one rate is configured. Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* | | net/mlx5: Pack mlx5_rl_entry structureParav Pandit2021-04-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mlx5_rl_entry structure is not properly packed as shown below. Due to this an array of size 9144 bytes allocated which is aligned to 16Kbytes. Hence, pack the structure and avoid the wastage. This offers 8Kbytes of saving per mlx5_core_dev struct. pahole -C mlx5_rl_entry drivers/net/ethernet/mellanox/mlx5/core/en_main.o Existing layout: struct mlx5_rl_entry { u8 rl_raw[48]; /* 0 48 */ u16 index; /* 48 2 */ /* XXX 6 bytes hole, try to pack */ u64 refcount; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ u16 uid; /* 64 2 */ u8 dedicated:1; /* 66: 0 1 */ /* size: 72, cachelines: 2, members: 5 */ /* sum members: 60, holes: 1, sum holes: 6 */ /* sum bitfield members: 1 bits (0 bytes) */ /* padding: 5 */ /* bit_padding: 7 bits */ /* last cacheline: 8 bytes */ }; After alignment: struct mlx5_rl_entry { u8 rl_raw[48]; /* 0 48 */ u64 refcount; /* 48 8 */ u16 index; /* 56 2 */ u16 uid; /* 58 2 */ u8 dedicated:1; /* 60: 0 1 */ /* size: 64, cachelines: 1, members: 5 */ /* padding: 3 */ /* bit_padding: 7 bits */ }; Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller2021-03-251-0/+7
|\ \ \ | | |/ | |/| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net/mlx5: Set QP timestamp mode to defaultMaor Gottlieb2021-03-101-0/+7
| |/ | | | | | | | | | | | | | | | | | | QPs which don't care from timestamp mode, should set the ts_format to default, otherwise the QP creation could be failed if the timestamp mode is not supported. Fixes: 2fe8d4b87802 ("RDMA/mlx5: Fail QP creation if the device can not support the CQE TS") Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* | net/mlx5e: Do not reload ethernet ports when changing eswitch modeRoi Dayan2021-03-161-0/+1
| | | | | | | | | | | | | | | | | | When switching modes between legacy and switchdev and back, do not reload ethernet interfaces. just change the profile from nic profile to uplink rep profile in switchdev mode. Signed-off-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* | net/mlx5: Move devlink port from mlx5e priv to mlx5e resourcesRoi Dayan2021-03-161-0/+1
| | | | | | | | | | | | | | | | | | | | We re-use the native NIC port net device instance for the Uplink representor, and the devlink port. When changing profiles we reset the mlx5e priv but we should still use the devlink port so move it to mlx5e resources. Signed-off-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* | net/mlx5: Move mlx5e hw resources into a sub objectRoi Dayan2021-03-161-4/+6
| | | | | | | | | | | | | | | | This is to separate between resources attributes and other attributes we will want to use. Signed-off-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* | net/mlx5e: Allow to match on ICMP parametersMaor Dickman2021-03-121-0/+2
|/ | | | | | | | | Support matching on ICMPv4/6 type and code parameters using misc3 section of match parameters. Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds2021-02-222-10/+5
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull rdma updates from Jason Gunthorpe: "This is quite a small cycle, if not for Lee's 70 patches cleaning the kdocs it would be well below typical for patch count. Most of the interesting work here was in the HNS and rxe drivers which got fairly major internal changes. Summary: - Driver updates and bug fixes: siw, hns, bnxt_re, mlx5, efa - Significant rework in rxe to get it ready to have XRC support added - Several rts bug fixes - Big series to get to 'make W=1' cleanness, primarily updating kdocs - Support for creating a RDMA MR from a DMABUF fd to allow PCI peer to peer transfers to GPU VRAM - Device disassociation now works properly with umad - Work to support more than 255 ports on a RDMA device - Further support for the new HNS HIP09 hardware - Coding style cleanups: comma to semicolon, unneded semicolon/blank lines, remove 'h' printk format, don't check for NULL before kfree, use true/false for bool" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (205 commits) RDMA/rtrs-srv: Do not pass a valid pointer to PTR_ERR() RDMA/srp: Fix support for unpopulated and unbalanced NUMA nodes RDMA/mlx5: Fail QP creation if the device can not support the CQE TS RDMA/mlx5: Allow CQ creation without attached EQs RDMA/rtrs-srv-sysfs: fix missing put_device RDMA/rtrs-srv: fix memory leak by missing kobject free RDMA/rtrs: Only allow addition of path to an already established session RDMA/rtrs-srv: Fix stack-out-of-bounds RDMA/rxe: Remove unused pkt->offset RDMA/ucma: Fix use-after-free bug in ucma_create_uevent RDMA/core: Fix kernel doc warnings for ib_port_immutable_read() RDMA/qedr: Use true and false for bool variable RDMA/hns: Adjust definition of FRMR fields RDMA/hns: Refactor process of posting CMDQ RDMA/hns: Adjust fields and variables about CMDQ tail/head RDMA/hns: Remove redundant operations on CMDQ RDMA/hns: Fixes missing error code of CMDQ RDMA/hns: Remove unused member and variable of CMDQ RDMA/ipoib: Remove racy Subnet Manager sendonly join checks RDMA/mlx5: Support 400Gbps IB rate in mlx5 driver ...
| * Merge tag 'v5.11' into rdma.git for-nextJason Gunthorpe2021-02-181-18/+0
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Linux 5.11 Merged to resolve conflicts with RDMA rc commits - drivers/infiniband/sw/rxe/rxe_net.c The final logic is to call rxe_get_dev_from_net() again with the master netdev if the packet was rx'd on a vlan. To keep the elimination of the local variables requires a trivial edit to the code in -rc Link: https://lore.kernel.org/r/20210210131542.215ea67c@canb.auug.org.au Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * \ Merge branch 'mlx5_timestamp' into rdma.git for-nextJason Gunthorpe2021-02-161-5/+49
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Leon Romanovsky says: ==================== Add an extra timestamp format for mlx5_ib device. ==================== Based on the mlx5-next branch at git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux due to dependencies. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> * branch 'mlx5_timestamp': RDMA/mlx5: Fail QP creation if the device can not support the CQE TS net/mlx5: Add new timestamp mode bits
| * | | RDMA/mlx5: Allow CQ creation without attached EQsTal Gilboa2021-02-161-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The traditional DevX CQ creation flow goes through mlx5_core_create_cq() which checks that the given EQN corresponds to an existing EQ and attaches a devx handler to the EQN for the CQ. In some cases the EQ will not be a kernel EQ, but will be controlled by modify CQ, don't block creating these just because the EQN can't be found in the kernel. Link: https://lore.kernel.org/r/20210211085549.1277674-1-leon@kernel.org Signed-off-by: Tal Gilboa <talgi@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | RDMA/mlx5: Cleanup the synchronize_srcu() from the ODP flowYishai Hadas2021-02-081-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Cleanup the synchronize_srcu() from the ODP flow as it was found to be a very heavy time consumer as part of dereg_mr. For example de-registration of 10000 ODP MRs each with size of 2M hugepage took 19.6 sec comparing de-registration of same number of non ODP MRs that took 172 ms. The new locking scheme uses the wait_event() mechanism which follows the use count of the MR instead of using synchronize_srcu(). By that change, the time required for the above test took 95 ms which is even better than the non ODP flow. Once fully dropped the srcu usage, had to come with a lock to protect the XA access. As part of using the above mechanism we could also clean the num_deferred_work stuff and follow the use count instead. Link: https://lore.kernel.org/r/20210202071309.2057998-1-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | IB/mlx5: Move mlx5_port_caps from mlx5_core_dev to mlx5_ib_devParav Pandit2021-02-051-8/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mlx5_port_caps are RDMA specific capabilities. These are not used by the mlx5_core_device at all. Move them to mlx5_ib_dev where it is used and reduce the scope of it to multiple drivers. Link: https://lore.kernel.org/r/20210203130133.4057329-2-leon@kernel.org Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* | | | Merge branch 'mlx5-next' of ↵David S. Miller2021-02-163-12/+102
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Saeed Mahameed says: ==================== pull-request: mlx5-next 2021-02-16 The patches in this pr are already submitted and reviewed through the netdev and rdma mailing lists. The series includes mlx5 HW bits and definitions for mlx5 real time clock translation and handling in the mlx5 driver clock module to enable and support such mode [1] [1] https://patchwork.kernel.org/project/netdevbpf/patch/20210212223042.449816-7-saeed@kernel.org/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net/mlx5: Move all internal timer metadata into a dedicated structEran Ben Elisha2021-02-161-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Internal timer mode (SW clock) requires some PTP clock related metadata structs. Real time mode (HW clock) will not need these metadata structs. This separation emphasize the different interfaces for HW clock and SW clock. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
| * | | | net/mlx5: Add register layout to support real-time time-stampEran Ben Elisha2021-02-163-3/+33
| | |/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add needed structure layouts and defines for MTUTC (Management UTC) register. MTUTC will be used for cyc2time HW translation. In addition, add cyc2time modify capability bit and init segment HCA real time address. Finally, add capability bits indicating which time-stamping format is supported per SQ and RQ. Add ts_format in the queue's context layout to allow configuration. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
| * | | net/mlx5: Add new timestamp mode bitsAharon Landau2021-02-161-5/+49
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These fields declare which timestamp mode is supported by the device per RQ/SQ/QP. In addition add the ts_format field to the select the mode for RQ/SQ/QP. Link: https://lore.kernel.org/r/20210209131107.698833-2-leon@kernel.org Signed-off-by: Aharon Landau <aharonl@nvidia.com> Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
| * | net/mlx5: Expose ifc bits for query modify headerYishai Hadas2021-01-191-0/+12
| | | | | | | | | | | | | | | | | | | | | Expose ifc bits for query_modify_header_context_in to be used by DEVX. Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
* | | net/mlx5: Delete device list leftoverLeon Romanovsky2021-02-101-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | Device list is not stored in mlx5_priv anymore, so delete it as it's not used. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* | | net/mlx5e: Match recirculated packet miss in slow table using reg_c1Vlad Buslov2021-02-051-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previous patch in series that implements stack devices RX path implements indirect table rules that match on tunnel VNI. After such rule is created all tunnel traffic is recirculated to root table. However, recirculated packet might not match on any rules installed in the table (for example, when IP traffic follows ARP traffic). In that case packets appear on representor of tunnel endpoint VF instead being redirected to the VF itself. Extend slow table with additional flow group that matches on reg_c0 (source port value set by indirect tables implemented by previous patch in series) and reg_c1 (special 0xFFF mark). When creating offloads fdb tables, install one rule per VF vport to match on recirculated miss packets and redirect them to appropriate VF vport. Modify indirect tables code to also rewrite reg_c1 with special 0xFFF mark. Implementation reuses reg_c1 tunnel id bits. This is safe to do because recirculated packets are always matched before decapsulation. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* | | net/mlx5e: Refactor reg_c1 usageVlad Buslov2021-02-051-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Following patch in series uses reg_c1 in eswitch code. To use reg_c1 helpers in both TC and eswitch code, refactor existing helpers according to similar use case of reg_c0 and move the functionality into eswitch.h. Calculate reg mappings length from new defines to ensure that they are always in sync and only need to be changed in single place. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>