Commit message | Author | Age | Files | Lines
* vsock: remove vm_sockets_get_local_cid()Stefano Garzarella2019-11-143-13/+1
| | | | | | | | | | | vm_sockets_get_local_cid() is only used in virtio_transport_common.c. We can replace it calling the virtio_transport_get_ops() and using the get_local_cid() callback registered by the transport. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
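The replacement described above, asking the transport for its CID through a registered callback rather than through a global helper, can be sketched in plain C. This is a minimal user-space illustration, not the kernel code: `struct transport_ops`, `demo_get_ops()` and the CID value 3 are hypothetical stand-ins for the real ops table, `virtio_transport_get_ops()` and the real CID.

```c
#include <stdint.h>

/* Hypothetical stand-in for the transport ops table; only the callback
 * mentioned in the commit message is modelled. */
struct transport_ops {
	uint32_t (*get_local_cid)(void);
};

/* Hypothetical transport that always reports CID 3. */
static uint32_t demo_get_local_cid(void)
{
	return 3;
}

static const struct transport_ops demo_ops = {
	.get_local_cid = demo_get_local_cid,
};

/* Stand-in for virtio_transport_get_ops(): returns the registered table. */
static const struct transport_ops *demo_get_ops(void)
{
	return &demo_ops;
}

/* The caller goes through the ops table instead of a global helper. */
static uint32_t local_cid(void)
{
	return demo_get_ops()->get_local_cid();
}
```

With this shape the common code depends only on the ops table, so a helper that hard-wires one transport is no longer needed.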
* vsock/vmci: remove unused VSOCK_DEFAULT_CONNECT_TIMEOUTStefano Garzarella2019-11-141-5/+0
| | | | | | | | | | | | | | | The VSOCK_DEFAULT_CONNECT_TIMEOUT definition was introduced with commit d021c344051af ("VSOCK: Introduce VM Sockets"), but it is never used in the net/vmw_vsock/vmci_transport.c. VSOCK_DEFAULT_CONNECT_TIMEOUT is used and defined in net/vmw_vsock/af_vsock.c Cc: Jorgen Hansen <jhansen@vmware.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'octeontx2-af-Debugfs-support-and-updates-to-parser-profile'David S. Miller2019-11-1418-4338/+14241
|\
| | Sunil Goutham says:
| | ====================
| | octeontx2-af: Debugfs support and updates to parser profile
| |
| | This patchset adds debugfs support to dump various HW state machine info, which helps in debugging issues. Info includes
| | - Current queue context, stats, resource utilization etc
| | - MCAM entry utilization, miss and pkt drop counter
| | - CGX ingress and egress stats
| | - Current RVU block allocation status
| | - etc.
| |
| | The rest of the patches have changes with respect to
| | - Updated packet parsing profile for parsing more protocols.
| | - RSS algorithms to include inner protocols while generating hash
| | - Handle current version of silicon's limitations wrt shaping, coloring and fixed mapping of transmit limiter queue's configuration.
| | - Enable broadcast packet replication to PF and its VFs.
| | - Support for configurable NDC cache waymask
| | - etc
| |
| | Changes from v1:
| | Removed inline keyword for newly introduced APIs in few patches.
| | - Suggested by David Miller.
| | ====================
| |
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * octeontx2-af: Start/Stop traffic in CGX along with NPCSubbaraya Sundeep2019-11-144-14/+65
| | Traffic for a CGX mapped NIXLF can be stopped by disabling entries in NPC MCAM or by configuring CGX, and mailbox messages exist for the two options. If traffic is stopped at CGX then VFs of that PF are also affected, hence CGX traffic should be started/stopped by tracking all the users of it. This patch implements that CGX user tracking. CGX is also configured along with NPC if required. Also removed a check which mandates even number of LBK VFs. Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * octeontx2-af: Add option to disable dynamic entry caching in NDCSunil Goutham2019-11-143-3/+112
| | A config option is added to disable caching of dynamic entries like SQEs and stack pages. Also locks down all HW contexts in NDC, preventing them from being evicted. This option is useful when the queue count is large and there are huge NDC cache misses. It's a trade-off between SQ context misses and dynamically changing entries like SQE and stack page pointers. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * octeontx2-af: Support configurable NDC cache way_maskGeetha sowjanya2019-11-143-11/+28
| | Each of the NIX/NPA LFs can choose which ways of their respective NDC caches should be used to cache their contexts. This enables flexible configurations like disabling caching for an LF, limiting its context to a certain set of ways, etc. Separate way_mask for NIX-TX and NIX-RX is not supported. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * octeontx2-af: Enable broadcast packet replicationSunil Goutham2019-11-144-59/+69
| | Ingress packet replication support has been added to 96xx B0 silicon. This patch enables using that feature to replicate ingress broadcast packets to PF and its VFs. Also fixed below issues - VFs can also install NPC MCAM entry to forward broadcast pkts. Otherwise, unless PF's interface is UP, VFs will not receive bcast packets. - NPC MCAM entry is disabled when PF and all its VFs are down. - Few corner cases in installing multicast entry list. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * octeontx2-af: Support fixed transmit scheduler topologySunil Goutham2019-11-149-259/+513
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CN96xx initial silicon doesn't support all features pertaining to NIX transmit scheduling and shaping. - It supports a fixed topology of 1:1 mapped transmit limiters at all levels. - Supports DWRR only at SMQ/MDQ and TL1. - Doesn't support shaping and coloring. This patch adds HW capability structure by which each variant and skew of silicon can be differentiated by their supported features. And adds support for A0 silicon's transmit scheduler capabilities or rather limitations. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * octeontx2-af: Add more RSS algorithmsKiran Kumar K2019-11-142-11/+114
| | This patch adds support for a few more RSS key types for the flow key algorithm to compute the rss hash index. Following flow key types have been added. - Tunnel types like NVGRE, VXLAN, GENEVE. - L2 offload type ETH_DMAC, here only the 6 DMAC bytes are considered. - And extension header IPV6_EXT (1 byte followed by IPV6 header) - Hashing inner protocol fields for inner DMAC, IPv4/v6, TCP, UDP, SCTP. Signed-off-by: Kiran Kumar K <kirankumark@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * octeontx2-af: Clear NPC MCAM entries before updateNithin Dabilpuram2019-11-141-2/+31
| | Writes to the NPC MCAM1 and MCAM0 registers are suppressed if they happen to form a reserved combination. Hence clear and disable MCAM entries before update. From HRM: [CAM(1)]<n>=1, [CAM(0)]<n>=1: Reserved. The reserved combination is not allowed. Hardware suppresses any write to CAM(0) or CAM(1) that would result in the reserved combination for any CAM bit. Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
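Why clearing first matters can be shown with a small software model of the quoted rule. This is a sketch under assumptions, not driver code: `struct cam_entry` and the `write_cam*()` helpers are hypothetical, and each CAM word is modelled as a single 64-bit value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one MCAM entry: two CAM words. */
struct cam_entry {
	uint64_t cam0;
	uint64_t cam1;
};

/* Model of the quoted rule: a write is dropped if any bit would end up
 * set in both CAM(0) and CAM(1). */
static bool write_cam0(struct cam_entry *e, uint64_t v)
{
	if (v & e->cam1)
		return false;	/* suppressed */
	e->cam0 = v;
	return true;
}

static bool write_cam1(struct cam_entry *e, uint64_t v)
{
	if (v & e->cam0)
		return false;	/* suppressed */
	e->cam1 = v;
	return true;
}

/* Clear both words first so stale bits cannot collide with new ones. */
static bool update_entry(struct cam_entry *e, uint64_t c0, uint64_t c1)
{
	write_cam0(e, 0);
	write_cam1(e, 0);
	return write_cam0(e, c0) && write_cam1(e, c1);
}
```

In this model, swapping the two words of an existing entry fails with direct writes, because the first write collides with the stale value of the other word, while `update_entry()` succeeds.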
| * octeontx2-af: Update NPC KPU packet parsing profileHao Zheng2019-11-144-3792/+11236
| | Updated NPC KPU packet parsing profile with support for following
| | - Fragmentation support for IPv4 IPv6 outer header
| | - NIX instruction header support
| | - QinQ with TPID of 0x8100 as non inner most vlan tag, as legacy network equipment still generates QinQ packets with this configuration.
| | - To better support RSS for tunnelled packets, udp based tunnel protocols such as vxlan, vxlan-gpe, geneve and gtpu are now captured into a separate layer E. Consequently, the inner packet headers are pushed one layer down to LF, LG, and LH accordingly.
| | - Support for rfc7510 mpls in udp. Up to 4 MPLS labels can be parsed and captured in one layer LE.
| | - Parser support for DSA, extended DSA and eDSA tags right after ethernet header by Marvell SOHO and Falcon switches. For extended DSA and eDSA tags, a special PKIND of 62 is used, as these tags don't contain a tpid field.
| | - Higig2 protocol header parsing support, added a NPC_LT_LA_HIGIG2_ETHER for a combined header of HIGIG2 and Ethernet. Add a NPC_LT_LA_IH_NIX_HIGIG2_ETHER for a combined header of nix_ih, HIGIG2 and Ethernet on egress side. Also added 2 upper flags in LA to indicate the presence of nix_ih and HIGIG2.
| |
| | Other changes include
| | - IPv4.TTL==0 IPv6.HLIM==0 check
| | - Per RFC 1858, mark fragment offset == 1 as error
| | - TCP invalid flags check
| | - Separate error codes for outer and inner IPv4 checksum errors.
| | - Fix a parser error when KPU parses incoming IPSec ESP and AH packets
| | - NPC vtag capture/strip hardware expects tag pointer to point to tpid/ethertype instead of tci. So move lb_ptr to point to tpid/ethertype.
| | - Fix npc parser error when parsing udp packets that don't have any payload.
| | - For a single MCAM entry to match on packets with one or stacked vlan tags combine NPC_LT_LB_STAG and NPC_LT_LB_QINQ to NPC_LT_LB_STAG_QINQ.
| | - NVGRE to have a separate ltype LD_NVGRE instead of combined with LD_GRE.
| | - Reserve top LD/LTYPEs to support custom KPU profile fields.
| |
| | Signed-off-by: Hao Zheng <haoz@marvell.com>
| | Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * octeontx2-af: Add macro to generate mbox handlers declarationsSubbaraya Sundeep2019-11-142-134/+17
| | | | | | | | | | | | | | | | | | | | For every mailbox handler added to rvu, we are adding a function declaration in rvu header file. Cleaned this up by adding a macro to generate these declarations automatically. Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
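One common way to generate such declarations is an X-macro list that is expanded more than once. The sketch below assumes that general technique; the list name `DEMO_MBOX_MESSAGES`, the message IDs, the handler names and the request/response types are all hypothetical and are not taken from the rvu header.

```c
/* Hypothetical request/response types shared by all demo handlers. */
struct demo_req { int arg; };
struct demo_rsp { int result; };

/* Hypothetical message list: M(name, id, handler suffix). */
#define DEMO_MBOX_MESSAGES \
	M(READY, 0x001, ready) \
	M(ATTACH, 0x002, attach)

/* Expand the list once to declare every handler prototype. */
#define M(_name, _id, _fn) \
	int demo_mbox_handler_##_fn(const struct demo_req *req, \
				    struct demo_rsp *rsp);
DEMO_MBOX_MESSAGES
#undef M

/* Handler bodies are still written by hand. */
int demo_mbox_handler_ready(const struct demo_req *req, struct demo_rsp *rsp)
{
	rsp->result = req->arg + 1;
	return 0;
}

int demo_mbox_handler_attach(const struct demo_req *req, struct demo_rsp *rsp)
{
	rsp->result = req->arg * 2;
	return 0;
}

/* The same list can be expanded again to build a dispatcher. */
int demo_dispatch(int id, const struct demo_req *req, struct demo_rsp *rsp)
{
	switch (id) {
#define M(_name, _id, _fn) \
	case _id: \
		return demo_mbox_handler_##_fn(req, rsp);
	DEMO_MBOX_MESSAGES
#undef M
	default:
		return -1;
	}
}
```

Adding a message then means adding one `M(...)` line and one handler body; the prototype and the dispatch case follow from the list.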
| * octeontx2-af: Sync hw mbox with bounce buffer.Geetha sowjanya2019-11-142-13/+30
| | If mailbox client has a bounce buffer or an intermediate buffer where mbox messages are framed then copy them from there to HW buffer. If 'mbase' and 'hw_mbase' are not the same then assume 'mbase' points to bounce buffer. This patch also adds msg_size field to mbox header to copy only valid data instead of whole buffer. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
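The copy rule described above can be sketched as follows. The field names `mbase`, `hw_mbase` and `msg_size` come from the commit message; `struct demo_mbox` and `demo_sync()` are hypothetical, and the real driver writes to device memory rather than calling a plain `memcpy()` on ordinary buffers.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical mailbox descriptor using the names from the message. */
struct demo_mbox {
	unsigned char *mbase;		/* where messages are framed */
	unsigned char *hw_mbase;	/* hardware-visible buffer */
	size_t msg_size;		/* number of valid bytes framed so far */
};

/* Copy only the valid bytes, and only when a bounce buffer is in use.
 * Returns the number of bytes copied. */
static size_t demo_sync(struct demo_mbox *m)
{
	if (m->mbase == m->hw_mbase)
		return 0;	/* messages were framed in place */

	memcpy(m->hw_mbase, m->mbase, m->msg_size);
	return m->msg_size;
}
```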
| * octeontx2-af: Add mbox API to validate all responsesSunil Goutham2019-11-144-12/+70
| | Added a new mailbox API which goes through all responses to check their IDs and response codes. Also added logic to prevent queuing multiple works to process the same mailbox message. This scenario happens when AF is processing a PF's request and meanwhile PF sends an ACK to an UP message sent by AF; then mbox_hdr->num_msgs in the PF->AF DOWN mbox region will be nonzero and AF will end up processing PF's request again. This is fixed by taking a backup of num_msgs counter and clearing the same in the mbox region before scheduling work. Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
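The backup-and-clear step can be sketched in isolation. This is a simplified model: `struct demo_mbox_hdr`, `struct demo_work` and `demo_intr()` are hypothetical, and the real code schedules a workqueue item instead of bumping a counter.

```c
/* Hypothetical shared mailbox header, visible to both sides. */
struct demo_mbox_hdr {
	unsigned short num_msgs;
};

/* Hypothetical per-mailbox work item. */
struct demo_work {
	unsigned short num_msgs;	/* private backup of the counter */
	int queued;			/* how many times work was queued */
};

/* Interrupt path: queue work only if there are new messages. The counter
 * is backed up and cleared first, so a second interrupt that arrives
 * before the work runs does not see the same messages again. */
static int demo_intr(struct demo_mbox_hdr *hdr, struct demo_work *w)
{
	if (!hdr->num_msgs)
		return 0;

	w->num_msgs = hdr->num_msgs;
	hdr->num_msgs = 0;
	w->queued++;
	return 1;
}
```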
| * octeontx2-af: Add NPC MCAM entry allocation status to debugfsSunil Goutham2019-11-143-1/+213
| | Added support to display current NPC MCAM entries and counter's allocation status in debugfs. 'cat /sys/kernel/debug/octeontx2/npc/mcam_info' will dump following info - MCAM Rx and Tx keysize - Total MCAM entries and counters - Current available count - Count of number of MCAM entries and counters allocated by a RVU PF/VF device. Also, one NPC MCAM counter (last one) is reserved and mapped to NPC RX_INTF's MISS_ACTION to count dropped packets due to no MCAM entry match. This pkt drop counter can be checked via debugfs. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * octeontx2-af: Add per CGX port level NIX Rx/Tx countersLinu Cherian2019-11-145-2/+173
| | A CGX port is shared by a RVU PF and its VFs. These per CGX port level NIX Rx/Tx counters are cumulative stats of all NIXLFs sharing this port. These stats, when compared to CGX Rx/Tx stats, help in identifying pkts dropped within the system, if any. Signed-off-by: Linu Cherian <lcherian@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * octeontx2-af: Add CGX LMAC stats to debugfsPrakash Brahmajyosyula2019-11-142-0/+163
| | This patch adds CGX LMAC physical interface or serdes Rx/Tx packet stats to debugfs. 'cat cgx<idx>/lmac<idx>/stats' dumps the current interface link status and Rx/Tx stats. Stats include pkts received/transmitted, dropped, pause frames, etc. Signed-off-by: Prakash Brahmajyosyula <bprakash@marvell.com> Signed-off-by: Linu Cherian <lcherian@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * octeontx2-af: Add NDC block stats to debugfs.Prakash Brahmajyosyula2019-11-145-20/+201
| | NDC is a data cache unit which caches NPA and NIX block's aura/pool/RQ/SQ/CQ/etc contexts to reduce number of costly DRAM accesses. This patch adds support to dump cache's performance stats like cache line hit/miss counters, average cycles taken for accessing cached and non-cached data. This will help in checking if NPA/NIX context reads/writes are having NDC cache misses, which in turn might affect performance. Also changed NDC enums to reflect correct NDC hardware instance. Signed-off-by: Prakash Brahmajyosyula <bprakash@marvell.com> Signed-off-by: Linu Cherian <lcherian@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * octeontx2-af: Add NIX RQ, SQ and CQ contexts to debugfsPrakash Brahmajyosyula2019-11-143-2/+507
| | To aid in debugging NIX block related issues, added support to dump NIX block LF's RQ, SQ and CQ hardware contexts in debugfs. User can check which contexts are enabled currently and dump their current HW context. Four new files 'qsize', 'rq_ctx', 'sq_ctx' and 'cq_ctx' are added to the debugfs at '/sys/kernel/debug/octeontx2/nix/'. 'echo <nixlf index> > qsize' will display current enabled CQ/SQ/RQs. 'echo <nixlf> [rq number/all] > rq_ctx', 'echo <nixlf> [sq number/all] > sq_ctx' & 'echo <nixlf> [cq number/all] > cq_ctx' will dump RQ/SQ/CQ's current hardware context. Signed-off-by: Prakash Brahmajyosyula <bprakash@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * octeontx2-af: Add NPA aura and pool contexts to debugfsChristina Jacob2019-11-143-2/+530
| | To aid in debugging NPA related issues, added support to dump NPA (pool allocator) block LF's aura and pool hardware contexts in debugfs. User can check which contexts are enabled currently and dump their current HW context. Three new files 'qsize', 'aura_ctx', 'pool_ctx' are added to the debugfs at '/sys/kernel/debug/octeontx2/npa/'. 'echo <npalf index> > qsize' will display current enabled Aura/Pools. 'echo <npalf> [aura number/all] > aura_ctx' & 'echo <npalf> [aura number/all] > pool_ctx' will dump Aura/Pool context info. Signed-off-by: Christina Jacob <cjacob@marvell.com> Signed-off-by: Prakash Brahmajyosyula <bprakash@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * octeontx2-af: Dump current resource provisioning statusChristina Jacob2019-11-144-1/+169
|/
| Added support to dump current resource provisioning status of all resource virtualization unit (RVU) block's (i.e. NPA, NIX, SSO, SSOW, CPT, TIM) local functions attached to a PF_FUNC into a debugfs file. 'cat /sys/kernel/debug/octeontx2/rsrc_alloc' will show the current block LF's allocation status. Signed-off-by: Christina Jacob <cjacob@marvell.com> Signed-off-by: Prakash Brahmajyosyula <bprakash@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: mvneta: fix build skb for bm capable devicesLorenzo Bianconi2019-11-141-2/+2
| | | | | | | | | | | | | | | Fix build_skb for bm capable devices when they fall-back using swbm path (e.g. when bm properties are configured in device tree but CONFIG_MVNETA_BM_ENABLE is not set). In this case rx_offset_correction is overwritten so we need to use it building skb instead of MVNETA_SKB_HEADROOM directly Fixes: 8dc9a0888f4c ("net: mvneta: rely on build_skb in mvneta_rx_swbm poll routine") Fixes: 0db51da7a8e9 ("net: mvneta: add basic XDP support") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reported-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge tag 'mlx5-updates-2019-11-12' of ↵David S. Miller2019-11-1426-467/+887
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2019-11-12 1) Merge mlx5-next for devlink reload and flowtable offloads dependencies 2) Devlink reload support 3) TC Flowtable offloads 4) Misc cleanup ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5: TC: Offload flow table rulesPaul Blakey2019-11-133-5/+71
| | | | | | | | | | | | | | | | | | | | | | | | Since both tc rules and flow table rules are of the same format, we can re-use tc parsing for that, and move the flow table rules to their steering domain - In this case, the next chain after max tc chain. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * net/mlx5: Add devlink reloadMichael Guralnik2019-11-133-2/+25
| | | | | | | | | | | | | | | | | | | | | | Implement devlink reload for mlx5. Usage example: devlink dev reload pci/0000:06:00.0 Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * net/mlx5e: Set netdev name space on creationMichael Guralnik2019-11-133-0/+9
| | | | | | | | | | | | | | | | | | | | Use devlink instance name space to set the netdev net namespace. Preparation patch for devlink reload implementation. Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * net/mlx5e: Fix error flow cleanup in mlx5e_tc_tun_create_header_ipv4/6Eli Cohen2019-11-131-6/+12
| | | | | | | | | | | | | | | | | | | | Be sure to release the neighbour in case of failures after successful route lookup. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * net/mlx5: Remove redundant NULL initializationsEli Cohen2019-11-131-4/+4
| | Neighbour initializations to NULL are not necessary as the pointers are not used if an error is returned, and if success is returned, the pointers are initialized. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * net/mlx5: Read num_vfs before disabling SR-IOVParav Pandit2019-11-131-5/+6
| | mlx5_device_disable_sriov() currently reads num_vfs from the PCI core. However, when mlx5_device_disable_sriov() is executed, SR-IOV is already disabled at the PCI level. Due to this, disable_hca() cleanup is not done during the SR-IOV disable flow.
| |
| | mlx5_sriov_disable()
| |   pci_disable_sriov()
| |   mlx5_device_disable_sriov() <- num_vfs is zero here.
| |
| | When SR-IOV enablement fails during mlx5_sriov_enable(), HCAs are left in the enabled state because mlx5_device_disable_sriov() relies on num_vfs from the PCI core.
| |
| | mlx5_sriov_enable()
| |   mlx5_device_enable_sriov()
| |   pci_enable_sriov() <- Fails
| |
| | Hence, to overcome the above issues,
| | (a) Read num_vfs before disabling SR-IOV and use it.
| | (b) Use num_vfs given when enabling sriov in error unwinding path.
| |
| | Fixes: d886aba677a0 ("net/mlx5: Reduce dependency on enabled_vfs counter and num_vfs")
| | Signed-off-by: Parav Pandit <parav@mellanox.com>
| | Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
| | Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
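The ordering problem in the disable flow can be modelled in a few lines. This is a sketch with hypothetical types and helpers (`struct demo_dev`, `demo_pci_*()`), not the mlx5 code; it only shows why the VF count must be captured before the PCI-level disable resets it.

```c
/* Hypothetical model of the PCI core's view of SR-IOV. */
struct demo_pci {
	int num_vfs;
};

/* Hypothetical device: tracks how many VF HCAs are still enabled. */
struct demo_dev {
	struct demo_pci pci;
	int hcas_enabled;
};

static int demo_pci_num_vf(const struct demo_pci *p)
{
	return p->num_vfs;
}

/* After this call the PCI core reports zero VFs. */
static void demo_pci_disable_sriov(struct demo_pci *p)
{
	p->num_vfs = 0;
}

/* Cleanup works on the count it is given, not on the PCI core's count. */
static void demo_device_disable_sriov(struct demo_dev *d, int num_vfs)
{
	d->hcas_enabled -= num_vfs;
}

static void demo_sriov_disable(struct demo_dev *d)
{
	int num_vfs = demo_pci_num_vf(&d->pci);	/* read before disabling */

	demo_pci_disable_sriov(&d->pci);
	demo_device_disable_sriov(d, num_vfs);
}
```

Had the count been read after `demo_pci_disable_sriov()`, the cleanup would have seen zero VFs and left every HCA enabled.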
| * net/mlx5: DR, Fix matcher builders select checkAlex Vesker2019-11-131-1/+1
| | | | | | | | | | | | | | | | | | When selecting a matcher ste_builder_arr will always be evaluated as true, instead check if num_of_builders is set for validity. Fixes: 667f264676c7 ("net/mlx5: DR, Support IPv4 and IPv6 mixed matcher") Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
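The class of bug fixed above can be illustrated in plain C, assuming the builder array is a member embedded in the matcher structure (an assumption made for this sketch; `struct demo_matcher` is hypothetical). An array member used in a condition decays to a non-NULL address, so only the count says whether anything was built.

```c
/* Hypothetical matcher with an embedded builder array. */
struct demo_matcher {
	int builders[4];	/* embedded array: its address is never NULL */
	int num_of_builders;	/* how many entries are actually in use */
};

/* `if (m->builders)` would always be true, so validity is decided by
 * the element count instead. */
static int demo_is_valid(const struct demo_matcher *m)
{
	return m->num_of_builders > 0;
}
```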
| * Merge branch 'mlx5-next' of ↵Saeed Mahameed2019-11-1317-444/+759
| |\
| | | git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
| | |
| | | 1) New generic devlink param "enable_roce", for downstream devlink reload support
| | | 2) Do vport ACL configuration on per vport basis when enabling/disabling a vport. This enables to have vports enabled/disabled outside of eswitch config for future
| | | 3) Split the code for legacy vs offloads mode and make it clear
| | | 4) Tidy up vport locking and workqueue usage
| | | 5) Fix metadata enablement for ECPF
| | | 6) Make explicit use of VF property to publish IB_DEVICE_VIRTUAL_FUNCTION
| | | 7) E-Switch and flow steering core low level support and refactoring for netfilter flowtables offload
| | |
| | | Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * net/mlx5: Add new chain for netfilter flow table offloadPaul Blakey2019-11-133-6/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Netfilter tables (nftables) implements a software datapath that comes after tc ingress datapath. The datapath supports offloading such rules via the flow table offload API. This API is currently only used by NFT and it doesn't provide the global priority in regards to tc offload, so we assume offloading such rules must come after tc. It does provide a flow table priority parameter, so we need to provide some supported priority range. For that, split fastpath prio to two, flow table offload and tc offload, with one dedicated priority chain for flow table offload. Next patch will re-use the multi chain API to access this chain by allowing access to this chain by the fdb_sub_namespace. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * net/mlx5: Refactor creating fast path prio chainsPaul Blakey2019-11-131-36/+82
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Next patch will re-use this to add a new chain but in a different prio. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * net/mlx5: Accumulate levels for chains prio namespacesPaul Blakey2019-11-132-2/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Tc chains are implemented by creating a chained prio steering type, and inside it there is a namespace for each chain (FDB_TC_MAX_CHAINS). Each of those has a list of priorities. Currently, all namespaces in a prio start at the parent prio level. But since we can jump from chain (namespace) to another chain in the same prio, we need the levels for higher chains to be higher as well. So we created unused prios to account for levels in previous namespaces. Fix that by accumulating the namespaces levels if we are inside a chained type prio, and removing the unused prios. Fixes: 328edb499f99 ('net/mlx5: Split FDB fast path prio to multiple namespaces') Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * net/mlx5: Define fdb tc levels per prioPaul Blakey2019-11-132-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Define FDB_TC_LEVELS_PER_PRIO instead of magic number 2. This is the number of levels used by each tc prio table in the fdb. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * net/mlx5: Rename FDB_* tc related defines to FDB_TC_* definesPaul Blakey2019-11-134-15/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename it to prepare for next patch that will add a different type of offload to the FDB. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * net/mlx5: Simplify fdb chain and prio eswitch definesPaul Blakey2019-11-131-8/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | FDB_MAX_CHAIN and FDB_MAX_PRIO were defined differently depending on if CONFIG_MLX5_ESWITCH is enabled to save space on allocations. This is a minor space saving, and there is no real need for it. Simplify things instead, and define them the same in both cases. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * IB/mlx5: Load profile according to RoCE enablement stateMichael Guralnik2019-11-111-12/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When RoCE is disabled load mlx5_ib in raw_eth profile. Clean pf_profile roce capability checks as it will not be used without roce capability. Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * IB/mlx5: Rename profile and init methodsMichael Guralnik2019-11-113-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename uplink_rep_profile and its unique init and cleanup stages to suit its upcoming use as the profile when RoCE is disabled. Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * net/mlx5: Handle "enable_roce" devlink paramMichael Guralnik2019-11-114-0/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Register "enable_roce" param, default value is RoCE enabled. Current configuration is stored on mlx5_core_dev and exposed to user through the cmode runtime devlink param. Changing configuration requires changing the cmode driverinit devlink param and calling devlink reload. Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * net/mlx5: Document flow_steering_mode devlink paramMichael Guralnik2019-11-111-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | Add documentation for current mlx5 supported devlink param. Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * devlink: Add new "enable_roce" generic device paramMichael Guralnik2019-11-113-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | New device parameter to enable/disable handling of RoCE traffic in the device. Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * net/mlx5: fix spelling mistake "metdata" -> "metadata"Colin Ian King2019-11-051-1/+1
| | | There is a spelling mistake in an esw_warn warning message. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * net/mlx5: fix kvfree of uninitialized pointer specColin Ian King2019-11-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently when a call to esw_vport_create_legacy_ingress_acl_group fails the error exit path to label 'out' will cause a kvfree on the uninitialized pointer spec. Fix this by ensuring pointer spec is initialized to NULL to avoid this issue. Addresses-Coverity: ("Uninitialized pointer read") Fixes: 10652f39943e ("net/mlx5: Refactor ingress acl configuration") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
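The fix pattern, initializing a pointer to NULL so that a shared error label can free it unconditionally, looks like this in user-space C. `demo_setup()` is hypothetical, and `free()` stands in for `kvfree()`, which likewise accepts NULL.

```c
#include <stdlib.h>

/* Hypothetical setup routine with a single exit label. */
static int demo_setup(int fail_early)
{
	int *spec = NULL;	/* initialised so the exit path is always safe */
	int err = 0;

	if (fail_early) {
		err = -1;	/* error before spec is allocated */
		goto out;
	}

	spec = calloc(1, sizeof(*spec));
	if (!spec) {
		err = -2;
		goto out;
	}

out:
	free(spec);		/* free(NULL) is a no-op */
	return err;
}
```

Without the initializer, the early `goto out` would pass an indeterminate pointer to the free routine.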
| | * IB/mlx5: Introduce and use mlx5_core_is_vf()Parav Pandit2019-11-012-1/+6
| | | Instead of deciding whether a given device is a virtual function based on whether it is a PF, use the already defined MLX5_COREDEV_VF by introducing a helper API mlx5_core_is_vf(). This makes it possible to clearly identify PF, VF and non-virtual functions. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Vu Pham <vuhuong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * net/mlx5: E-switch, Enable metadata on own vportParav Pandit2019-11-013-23/+16
| | | Currently on ECPF, metadata is enabled on the ECPF vport = 0xfffe (manager vport). Metadata, when supported, must be enabled on the own vport, which is used to pass metadata to the vport of the NIC Rx Flow Table. Due to this error, traffic tagged by ingress ACL is not processed correctly at NIC rx flow table level, which is supposed to work on the metadata tag. Hence, instead of working on the eswitch manager vport, always work on the eswitch own vport regardless of PF or ECPF. Given that mlx5_eswitch_query/modify_esw_vport_context() is used to access other vport in legacy mode and own vport settings in switchdev mode, extend low level API to explicitly specify other_vport. Fixes: c1286050cf47 ("net/mlx5: E-Switch, Pass metadata from FDB to eswitch manager") Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * net/mlx5: Refactor ingress acl configurationParav Pandit2019-11-013-114/+200
| | | Drop, untagged, spoof check and untagged spoof check flow groups are limited to legacy mode only. Therefore, following refactoring is done to (a) improve code readability (b) have better code split between legacy and offloads mode
| | |
| | | 1. Move legacy flow groups under legacy structure
| | | 2. Add validity check for group deletion
| | | 3. Restrict scope of esw_vport_disable_ingress_acl to legacy mode
| | | 4. Rename esw_vport_enable_ingress_acl() to esw_vport_create_ingress_acl_table() and limit its scope to table creation
| | | 5. Introduce legacy flow groups creation helper esw_legacy_create_ingress_acl_groups() and keep its scope to legacy mode
| | | 6. Reduce offloads ingress groups from 4 to just 1 metadata group per vport
| | | 7. Removed redundant IS_ERR_OR_NULL as entries are marked NULL on free.
| | | 8. Shorten error message to remove redundant 'E-switch'
| | |
| | | Signed-off-by: Parav Pandit <parav@mellanox.com>
| | | Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * net/mlx5: Restrict metadata disablement to offloads modeParav Pandit2019-11-013-7/+6
| | | Now that there is clear separation for acl setup/cleanup between legacy and offloads mode, limit metadata disablement to offloads mode. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Vu Pham <vuhuong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * net/mlx5: E-switch, Offloads shift ACL programming during enable/disable vportVu Pham2019-11-013-31/+24
| | | Currently legacy mode enables ACL while enabling vport, while offloads mode enables ACL when moving to offloads mode. Bring consistency to both modes by enabling/disabling ACL when enabling/disabling a vport. It also eliminates creating ingress ACL table on unused ECPF vport in offloads mode. Signed-off-by: Vu Pham <vuhuong@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * net/mlx5: E-switch, Offloads introduce and use per vport acl tables APIsParav Pandit2019-11-011-17/+32
| | | Introduce and use per vport ACL tables creation and destroy APIs, so that a subsequent patch can use them during enabling/disabling a vport. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>