summaryrefslogtreecommitdiffstats
path: root/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c
Commit message (Collapse)AuthorAgeFilesLines
* mlx5: do not use RT_TOS for IPv6 flowlabelMatthias May2022-08-091-2/+2
| | | | | | | | | | | | | | | | | | | According to Guillaume Nault RT_TOS should never be used for IPv6. Quote: RT_TOS() is an old macro used to interprete IPv4 TOS as described in the obsolete RFC 1349. It's conceptually wrong to use it even in IPv4 code, although, given the current state of the code, most of the existing calls have no consequence. But using RT_TOS() in IPv6 code is always a bug: IPv6 never had a "TOS" field to be interpreted the RFC 1349 way. There's no historical compatibility to worry about. Fixes: ce99f6b97fcd ("net/mlx5e: Support SRIOV TC encapsulation offloads for IPv6 tunnels") Acked-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Matthias May <matthias.may@westermo.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* net/mlx5e: TC, fix decap fallback to uplink when int port not supportedAriel Levkovich2022-05-041-1/+2
| | | | | | | | | | | | | | | | | | | | | When resolving the decap route device for a tunnel decap rule, the result may be an OVS internal port device. Prior to adding the support for internal port offload, such case would result in using the uplink as the default decap route device which allowed devices that can't support internal port offload to offload this decap rule. This behavior got broken by adding the internal port offload which will fail in case the device can't support internal port offload. To restore the old behavior, use the uplink device as the decap route as before when internal port offload is not supported. Fixes: b16eb3c81fe2 ("net/mlx5: Support internal port as decap route device") Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* mlx5: Don't accidentally set RTO_ONLINK before mlx5e_route_lookup_ipv4_get()Guillaume Nault2022-01-111-2/+3
| | | | | | | | | | | | | | Mask the ECN bits before calling mlx5e_route_lookup_ipv4_get(). The tunnel key might have the last ECN bit set. This interferes with the route lookup process as ip_route_output_key_hash() interpretes this bit specially (to restrict the route scope). Found by code inspection, compile tested only. Fixes: c7b9038d8af6 ("net/mlx5e: TC preparation refactoring for routing update event") Fixes: 9a941117fb76 ("net/mlx5e: Maximize ip tunnel key usage on the TC offloading path") Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* net/mlx5e: Specify out ifindex when looking up decap routeChris Mi2021-11-161-11/+12
| | | | | | | | | | | | | | | | | | There is a use case that the local and remote VTEPs are in the same host. Currently, the out ifindex is not specified when looking up the decap route for offloads. So in this case, a local route is returned and the route dev is lo. Actual tunnel interface can be created with a parameter "dev" [1], which specifies the physical device to use for tunnel endpoint communication. Pass this parameter to driver when looking up decap route for offloads. So that a unicast route will be returned. [1] ip link add name vxlan1 type vxlan id 100 dev enp4s0f0 remote 1.1.1.1 dstport 4789 Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* net/mlx5: Support internal port as decap route deviceAriel Levkovich2021-10-291-10/+19
| | | | | | | | | | | | | | When performing route device lookup for decap action, support the case of ovs internal port as the lookup result. In such case, an internal port struct is mapped and attached to the flow attributes so that the source port matching of the rule will match on the internal port's metadata value. Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* net/mlx5e: Offload internal port as encap route deviceAriel Levkovich2021-10-291-1/+2
| | | | | | | | | | | | | | | | | | | | | When pefroming encap action, a route lookup is performed to find the routing device the packet should be forwarded to after the encapsulation. This is the device that has the local tunnel ip address. This change adds support to offload an encap rule where the route device ends up being an ovs internal port. In such case, the driver will add a HW rule that will encapsulate the packet with the tunnel header and will overwrite the vport metadata in reg_c0 to the internal port metadata value. Finally, the packet will be forwarded to the root table to be processed again with the indication that it came from an internal port. Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* mlx5: fix build after mergeJakub Kicinski2021-10-221-2/+2
| | | | | | | | | Silent merge conflict between these two: 3d677735d3b7 ("net/mlx5: Lag, move lag files into directory") 14fe2471c628 ("net/mlx5: Lag, change multipath and bonding to be mutually exclusive") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller2021-10-221-0/+2
|\ | | | | | | | | | | | | | | Lots of simnple overlapping additions. With a build fix from Stephen Rothwell. Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5: Lag, change multipath and bonding to be mutually exclusiveMaor Dickman2021-10-201-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Both multipath and bonding events are changing the HW LAG state independently. Handling one of the features events while the other is already enabled can cause unwanted behavior, for example handling bonding event while multipath enabled will disable the lag and cause multipath to stop working. Fix it by ignoring bonding event while in multipath and ignoring FIB events while in bonding mode. Fixes: 544fe7c2e654 ("net/mlx5e: Activate HW multipath and handle port affinity based on FIB events") Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* | net/mlx5e: Specify out ifindex when looking up encap routeChris Mi2021-10-041-0/+8
|/ | | | | | | | | | | | | | | | | | | There is a use case that the local and remote VTEPs are in the same host. Currently, the out ifindex is not specified when looking up the encap route for offloads. So in this case, a local route is returned and the route dev is lo. Actual tunnel interface can be created with a parameter "dev" [1], which specifies the physical device to use for tunnel endpoint communication. Pass this parameter to driver when looking up encap route for offloads. So that a unicast route will be returned. [1] ip link add name vxlan1 type vxlan id 100 dev enp4s0f0 remote 1.1.1.1 dstport 4789 Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2021-08-131-0/+5
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.h 9e26680733d5 ("bnxt_en: Update firmware call to retrieve TX PTP timestamp") 9e518f25802c ("bnxt_en: 1PPS functions to configure TSIO pins") 099fdeda659d ("bnxt_en: Event handler for PPS events") kernel/bpf/helpers.c include/linux/bpf-cgroup.h a2baf4e8bb0f ("bpf: Fix potentially incorrect results with bpf_get_local_storage()") c7603cfa04e7 ("bpf: Add ambient BPF runtime context stored in current") drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c 5957cc557dc5 ("net/mlx5: Set all field of mlx5_irq before inserting it to the xarray") 2d0b41a37679 ("net/mlx5: Refcount mlx5_irq with integer") MAINTAINERS 7b637cd52f02 ("MAINTAINERS: fix Microchip CAN BUS Analyzer Tool entry typo") 7d901a1e878a ("net: phy: add Maxlinear GPY115/21x/24x driver") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * net/mlx5e: Avoid creating tunnel headers for local routeRoi Dayan2021-08-091-0/+5
| | | | | | | | | | | | | | | | | | | | | | It could be local and remote are on the same machine and the route result will be a local route which will result in creating encap id with src/dst mac address of 0. Fixes: a54e20b4fcae ("net/mlx5e: Add basic TC tunnel set action for SRIOV offloads") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* | net/mlx5: Fix typo in commentsCai Huoqing2021-08-111-1/+1
|/ | | | | | | | | | | | | | | | | | | | | | | | | Fix typo: *vectores ==> vectors *realeased ==> released *erros ==> errors *namepsace ==> namespace *trafic ==> traffic *proccessed ==> processed *retore ==> restore *Currenlty ==> Currently *crated ==> created *chane ==> change *cannnot ==> cannot *usuallly ==> usually *failes ==> fails *importent ==> important *reenabled ==> re-enabled *alocation ==> allocation *recived ==> received *tanslation ==> translation Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* net/mlx5: Added new parameters to reformat contextYevgeny Kliteynik2021-06-091-12/+26
| | | | | | | | | | | | | | | | | | | | Adding new reformat context type (INSERT_HEADER) requires adding two new parameters to reformat context - reformat_param_0 and reformat_param_1. As defined by HW spec, these parameters have different meaning for different reformat context type. The first parameter (reformat_param_0) is not new to HW spec, but it wasn't used by any of the supported reformats. The second parameter (reformat_param_1) is new to the HW spec - it was added to allow supporting INSERT_HEADER. For NSERT_HEADER, reformat_param_0 indicates the header used to reference the location of the inserted header, and reformat_param_1 indicates the offset of the inserted header from the reference point defined by reformat_param_0. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* net/mlx5e: Check correct ip_version in decapsulation route resolutionRoi Dayan2021-03-101-4/+4
| | | | | | | | | | | | | | flow_attr->ip_version has the matching that should be done inner/outer. When working with chains, decapsulation is done on chain0 and next chain match on outer header which is the original inner which could be ipv4. So in tunnel route resolution we cannot use that to know which ip version we are at so save tun_ip_version when parsing the tunnel match and use that. Fixes: a508728a4c8b ("net/mlx5e: VF tunnel RX traffic offloading") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Dmytro Linkin <dlinkin@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* net/mlx5e: TC preparation refactoring for routing update eventVlad Buslov2021-02-051-0/+198
| | | | | | | | | | | | | | | | | | | | | | | | Following patch in series implement routing update event which requires ability to modify rule match_to_reg modify header actions dynamically during rule lifetime. In order to accommodate such behavior, refactor and extend TC infrastructure in following ways: - Modify mod_hdr infrastructure to preserve its parse attribute for whole rule lifetime, instead of deallocating it after rule creation. - Extend match_to_reg infrastructure with new function mlx5e_tc_match_to_reg_set_and_get_id() that returns mod_hdr action id that can be used afterwards to update the action, and mlx5e_tc_match_to_reg_mod_hdr_change() that can modify existing actions by its id. - Extend tun API with new functions mlx5e_tc_tun_update_header_ipv{4|6}() that are used to updated existing encap entry tunnel header. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* net/mlx5e: Refactor neigh update infrastructureVlad Buslov2021-02-051-14/+8
| | | | | | | | | | | | | | | | | | | | | | Following patches in series implements route update which can cause encap entries to migrate between routing devices. Consecutively, their parent nhe's need to be also transferable between devices instead of having neigh device as a part of their immutable key. Move neigh device from struct mlx5_neigh to struct mlx5e_neigh_hash_entry and check that nhe and neigh devices are the same in workqueue neigh update handler. Save neigh net_device that can change dynamically in dedicated nhe->dev field. With FIB event handler that is implemented in following patches changing nhe->dev, NETEVENT_DELAY_PROBE_TIME_UPDATE handler can concurrently access the nhe entry when traversing neigh list under rcu read lock. Processing stale values in that handler doesn't change the handler logic, so just wrap all accesses to the dev pointer in {WRITE|READ}_ONCE() helpers. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* net/mlx5e: Extract tc tunnel encap/decap code to dedicated fileVlad Buslov2021-02-051-0/+1
| | | | | | | | | | | | | | Following patches in series extend the extracted code with routing infrastructure. To improve code modularity created a dedicated tc_tun_encap.c source file and move encap/decap related code to the new file. Export code that is used by both regular TC code and encap/decap code into tc_priv.h (new header intended to be used only by TC module). Rename some exported functions by adding "mlx5e_" prefix to their names. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* net/mlx5e: VF tunnel RX traffic offloadingVlad Buslov2021-02-051-0/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When tunnel endpoint is on VF the encapsulated RX traffic is exposed on the representor of the VF without any further processing of rules installed on the VF. Detect such case by checking if the device returned by route lookup in decap rule handling code is a mlx5 VF and handle it with new redirection tables API. Example TC rules for VF tunnel traffic: 1. Rule that encapsulates the tunneled flow and redirects packets from source VF rep to tunnel device: $ tc -s filter show dev enp8s0f0_1 ingress filter protocol ip pref 4 flower chain 0 filter protocol ip pref 4 flower chain 0 handle 0x1 dst_mac 0a:40:bd:30:89:99 src_mac ca:2e:a7:3f:f5:0f eth_type ipv4 ip_tos 0/0x3 ip_flags nofrag in_hw in_hw_count 1 action order 1: tunnel_key set src_ip 7.7.7.5 dst_ip 7.7.7.1 key_id 98 dst_port 4789 nocsum ttl 64 pipe index 1 ref 1 bind 1 installed 411 sec used 411 sec Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 no_percpu used_hw_stats delayed action order 2: mirred (Egress Redirect to device vxlan_sys_4789) stolen index 1 ref 1 bind 1 installed 411 sec used 0 sec Action statistics: Sent 5615833 bytes 4028 pkt (dropped 0, overlimits 0 requeues 0) Sent software 0 bytes 0 pkt Sent hardware 5615833 bytes 4028 pkt backlog 0b 0p requeues 0 cookie bb406d45d343bf7ade9690ae80c7cba4 no_percpu used_hw_stats delayed 2. Rule that redirects from tunnel device to UL rep: $ tc -s filter show dev vxlan_sys_4789 ingress filter protocol ip pref 4 flower chain 0 filter protocol ip pref 4 flower chain 0 handle 0x1 dst_mac ca:2e:a7:3f:f5:0f src_mac 0a:40:bd:30:89:99 eth_type ipv4 enc_dst_ip 7.7.7.5 enc_src_ip 7.7.7.1 enc_key_id 98 enc_dst_port 4789 enc_tos 0 ip_flags nofrag in_hw in_hw_count 1 action order 1: tunnel_key unset pipe index 2 ref 1 bind 1 installed 434 sec used 434 sec Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 used_hw_stats delayed action order 2: mirred (Egress Redirect to device enp8s0f0_1) stolen index 4 ref 1 bind 1 installed 434 sec used 0 sec Action statistics: Sent 129936 bytes 1082 pkt (dropped 0, overlimits 0 requeues 0) Sent software 0 bytes 0 pkt Sent hardware 129936 bytes 1082 pkt backlog 0b 0p requeues 0 cookie ac17cf398c4c69e4a5b2f7aabd1b88ff no_percpu used_hw_stats delayed Co-developed-by: Dmytro Linkin <dlinkin@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* net/mlx5e: Remove redundant match on tunnel destination macVlad Buslov2021-02-051-8/+0
| | | | | | | | | | | | | Remove hardcoded match on tunnel destination MAC address. Such match is no longer required and would be wrong for stacked devices topology where encapsulation destination MAC address will be the address of tunnel VF that can change dynamically on route change (implemented in following patches in the series). Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* net/mlx5e: Refactor tun routing helpersVlad Buslov2021-02-051-109/+126
| | | | | | | | | | | | | | | | | Refactor tun routing helpers to use dedicated struct mlx5e_tc_tun_route_attr instead of multiple output arguments. This simplifies the callers (no need to keep track of bunch of output param pointers) and allows to unify struct release code in new mlx5e_tc_tun_route_attr_cleanup() helper instead of requiring callers to manually release some of the output parameters that require it. Simplify code by unifying error handling at the end of the function and rearranging code. Remove redundant empty line. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* net/mlx5e: Protect encap route dev from concurrent releaseVlad Buslov2020-11-051-26/+46
| | | | | | | | | | | | | | | | | | | | | | In functions mlx5e_route_lookup_ipv{4|6}() route_dev can be arbitrary net device and not necessary mlx5 eswitch port representor. As such, in order to ensure that route_dev is not destroyed concurrent the code needs either explicitly take reference to the device before releasing reference to rtable instance or ensure that caller holds rtnl lock. First approach is chosen as a fix since rtnl lock dependency was intentionally removed from mlx5 TC layer. To prevent unprotected usage of route_dev in encap code take a reference to the device before releasing rt. Don't save direct pointer to the device in mlx5_encap_entry structure and use ifindex instead. Modify users of route_dev pointer to properly obtain the net device instance from its ifindex. Fixes: 61086f391044 ("net/mlx5e: Protect encap hash table with mutex") Fixes: 6707f74be862 ("net/mlx5e: Update hw flows when encap source mac changed") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* net/mlx5e: Optimize performance for IPv4/IPv6 ethertypeEli Britstein2020-05-271-4/+4
| | | | | | | | | The HW is optimized for IPv4/IPv6. For such cases, pending capability, avoid matching on ethertype, and use ip_version field instead. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* net/mlx5e: Helper function to set ethertypeEli Britstein2020-05-271-8/+13
| | | | | | | | | Set ethertype match in a helper function as a pre-step towards optimizing it. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* net/mlx5e: Add support for hw encapsulation of MPLS over UDPEli Cohen2020-05-221-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | MPLS over UDP is supported by adding a rule on a representor net device which does tunnel_key set, push mpls and forward to a baredup device. At the hardware level we use a packet_reformat_context object to do the encapsulation of the packet. The resulting packet looks as follows (left side transmitted first): outer L2 | outer IP | UDP | MPLS | inner L3 and data | Example usage: tc filter add dev $rep0 protocol ip prio 1 root flower skip_sw \ action tunnel_key set src_ip 8.8.8.21 dst_ip 8.8.8.24 id 555 \ dst_port 6635 tos 4 ttl 6 csum action mpls push protocol 0x8847 \ label 555 tc 3 action mirred egress redirect dev bareudp0 This is how the filter is shown with tc filter show: tc filter show dev enp59s0f0_0 ingress filter protocol ip pref 1 flower chain 0 filter protocol ip pref 1 flower chain 0 handle 0x1 eth_type ipv4 skip_sw in_hw in_hw_count 1 action order 1: tunnel_key set src_ip 8.8.8.21 dst_ip 8.8.8.24 key_id 555 dst_port 6635 csum tos 0x4 ttl 6 pipe index 1 ref 1 bind 1 action order 2: mpls push protocol mpls_uc label 555 tc 3 ttl 255 pipe index 1 ref 1 bind 1 action order 3: mirred (Egress Redirect to device bareudp0) stolen index 1 ref 1 bind 1 Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* net/mlx5e: Extract neigh-specific code from en_rep.c to rep/neigh.cVlad Buslov2020-05-221-0/+1
| | | | | | | | | | | As a preparation for introducing new kconfig option that controls compilation of all TC offloads code in mlx5, extract neigh-specific code from en_rep.c to standalone file. This allows easily compiling out the code by only including new source in make file when corresponding kconfig is enabled instead of adding multiple ifdef blocks to en_rep. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* net/mlx5e: Extract TC-specific code from en_rep.c to rep/tc.cVlad Buslov2020-05-221-0/+1
| | | | | | | | | | | As a preparation for introducing new kconfig option that controls compilation of all TC offloads code in mlx5, extract TC-specific code from en_rep.c to standalone file. This allows easily compiling out the code by only including new source in make file when corresponding kconfig is enabled instead of adding multiple ifdef blocks to en_rep. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* net/mlx5e: Use IS_ERR() to check and simplify codeTang Bin2020-05-221-3/+2
| | | | | | | | | | Use IS_ERR() and PTR_ERR() instead of PTR_ERR_OR_ZERO() to simplify code, avoid redundant judgements. Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com> Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* net/mlx5: Avoid forwarding to other eswitch uplinkEli Cohen2020-03-171-0/+3
| | | | | | | | | Do not allow forwarding of encapsulated traffic received from one eswtich's uplink to another eswtich's uplink. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* net/mlx5e: Move tc tunnel parsing logic with the rest at tc_tun modulePaul Blakey2020-02-191-2/+110
| | | | | | | | | | | | | | | Currently, tunnel parsing is split between en_tc and tc_tun. The next patch will replace the tunnel fields matching with a register match, and will not need this parsing. Move the tunnel parsing logic to tc_tun as a pre-step for skipping it in the next patch. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds2019-12-081-5/+5
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking fixes from David Miller: 1) More jumbo frame fixes in r8169, from Heiner Kallweit. 2) Fix bpf build in minimal configuration, from Alexei Starovoitov. 3) Use after free in slcan driver, from Jouni Hogander. 4) Flower classifier port ranges don't work properly in the HW offload case, from Yoshiki Komachi. 5) Use after free in hns3_nic_maybe_stop_tx(), from Yunsheng Lin. 6) Out of bounds access in mqprio_dump(), from Vladyslav Tarasiuk. 7) Fix flow dissection in dsa TX path, from Alexander Lobakin. 8) Stale syncookie timestampe fixes from Guillaume Nault. [ Did an evil merge to silence a warning introduced by this pull - Linus ] * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (84 commits) r8169: fix rtl_hw_jumbo_disable for RTL8168evl net_sched: validate TCA_KIND attribute in tc_chain_tmplt_add() r8169: add missing RX enabling for WoL on RTL8125 vhost/vsock: accept only packets with the right dst_cid net: phy: dp83867: fix hfs boot in rgmii mode net: ethernet: ti: cpsw: fix extra rx interrupt inet: protect against too small mtu values. gre: refetch erspan header from skb->data after pskb_may_pull() pppoe: remove redundant BUG_ON() check in pppoe_pernet tcp: Protect accesses to .ts_recent_stamp with {READ,WRITE}_ONCE() tcp: tighten acceptance of ACKs not matching a child socket tcp: fix rejected syncookies due to stale timestamps lpc_eth: kernel BUG on remove tcp: md5: fix potential overestimation of TCP option space net: sched: allow indirect blocks to bind to clsact in TC net: core: rename indirect block ingress cb function net-sysfs: Call dev_hold always in netdev_queue_add_kobject net: dsa: fix flow dissection on Tx path net/tls: Fix return values to avoid ENOTSUPP net: avoid an indirect call in ____sys_recvmsg() ...
| * net: ipv6_stub: use ip6_dst_lookup_flow instead of ip6_dst_lookupSabrina Dubroca2019-12-041-4/+4
|/ | | | | | | | | | | | | | | | | | | | ipv6_stub uses the ip6_dst_lookup function to allow other modules to perform IPv6 lookups. However, this function skips the XFRM layer entirely. All users of ipv6_stub->ip6_dst_lookup use ip_route_output_flow (via the ip_route_output_key and ip_route_output helpers) for their IPv4 lookups, which calls xfrm_lookup_route(). This patch fixes this inconsistent behavior by switching the stub to ip6_dst_lookup_flow, which also calls xfrm_lookup_route(). This requires some changes in all the callers, as these two functions take different arguments and have different return types. Fixes: 5f81bd2e5d80 ("ipv6: export a stub for IPv6 symbols used by vxlan") Reported-by: Xiumei Mu <xmu@redhat.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/mlx5e: Fix build error without IPV6YueHaibing2019-11-301-36/+38
| | | | | | | | | | | | | | | | | | | | | | If IPV6 is not set and CONFIG_MLX5_ESWITCH is y, building fails: drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c:322:5: error: redefinition of mlx5e_tc_tun_create_header_ipv6 int mlx5e_tc_tun_create_header_ipv6(struct mlx5e_priv *priv, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c:7:0: drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.h:67:1: note: previous definition of mlx5e_tc_tun_create_header_ipv6 was here mlx5e_tc_tun_create_header_ipv6(struct mlx5e_priv *priv, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Use #ifdef to guard this, also move mlx5e_route_lookup_ipv6 to cleanup unused warning. Reported-by: Hulk Robot <hulkci@huawei.com> Fixes: e689e998e102 ("net/mlx5e: TC, Stub out ipv6 tun create header function") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/mlx5e: Remove redundant pointer checkEli Cohen2019-11-221-12/+10
| | | | | | | | | | | | | When code reaches the "out" label, n is guaranteed to be valid so we can unconditionally call neigh_release. Also change the label to release_neigh to better reflect the fact that we unconditionally free the neighbour and also match other labels convention. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* net/mlx5e: TC, Stub out ipv6 tun create header functionSaeed Mahameed2019-11-221-4/+0
| | | | | | | | | | | | | | | Improve mlx5e_route_lookup_ipv6 function structure by avoiding #ifdef then return -EOPNOTSUPP in the middle of the function code. To do so, we stub out mlx5e_tc_tun_create_header_ipv6 which is the only caller of this helper function to avoid calling it altogether when ipv6 is compiled out, which should also cleanup some compiler warnings of unused variables. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* net/mlx5e: Fix error flow cleanup in mlx5e_tc_tun_create_header_ipv4/6Eli Cohen2019-11-131-6/+12
| | | | | | | | | | Be sure to release the neighbour in case of failures after successful route lookup. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* net/mlx5: Remove redundant NULL initializationsEli Cohen2019-11-131-4/+4
| | | | | | | | | | | Neighbour initializations to NULL are not necessary as the pointers are not used if an error is returned, and if success returned, pointers are initialized. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* net/mlx5: Fix rtable reference leakParav Pandit2019-10-291-3/+9
| | | | | | | | | | | If the rt entry gateway family is not AF_INET for multipath device, rtable reference is leaked. Hence, fix it by releasing the reference. Fixes: 5fb091e8130b ("net/mlx5e: Use hint to resolve route when in HW multipath mode") Fixes: e32ee6c78efa ("net/mlx5e: Support tunnel encap over tagged Ethernet") Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* net/mlx5: Add flow steering actions to fs_cmd shim layerMaor Gottlieb2019-09-031-13/+14
| | | | | | | | | | Add flow steering actions: modify header and packet reformat to the fs_cmd shim layer. This allows each namespace to define possibly different functionality for alloc/dealloc action commands. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* net/mlx5e: Rely on rcu instead of rtnl lock when getting upper devVlad Buslov2019-07-291-1/+13
| | | | | | | | | | | Function netdev_master_upper_dev_get() generates warning if caller doesn't hold rtnl lock. Modify rules update path to use rcu version of that function. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Jianbo Liu <jianbol@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* net/mlx5e: Simplify get_route_and_out_devs helper functionEli Britstein2019-07-291-12/+7
| | | | | | | | | The helper function has "if" branches that do the same. Merge them to simplify the code. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* net: flow_offload: rename tc_cls_flower_offload to flow_cls_offloadPablo Neira Ayuso2019-07-091-3/+3
| | | | | | | | | | | | And any other existing fields in this structure that refer to tc. Specifically: * tc_cls_flower_offload_flow_rule() to flow_cls_offload_flow_rule(). * TC_CLSFLOWER_* to FLOW_CLS_*. * tc_cls_common_offload to tc_cls_common_offload. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/mlx5e: Disallow tc redirect offload cases we don't supportPaul Blakey2019-06-281-1/+3
| | | | | | | | | | | | | | After changing the parent_id to be the same for both NICs of same the hardware device, netdev_port_same_parent_id now returns true for more cases (all the lower devices in the hierarchy are on the same hardware device). If merged eswitch isn't enabled, these cases aren't supported, so disallow them. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2019-06-171-5/+6
|\ | | | | | | | | | | | | Honestly all the conflicts were simple overlapping changes, nothing really interesting to report. Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5e: Support tagged tunnel over bondEli Britstein2019-06-071-5/+6
| | | | | | | | | | | | | | | | | | | | | | Stacked devices like bond interface may have a VLAN device on top of them. Detect lag state correctly under this condition, and return the correct routed net device, according to it the encap header is built. Fixes: e32ee6c78efa ("net/mlx5e: Support tunnel encap over tagged Ethernet") Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | net/mlx5e: Geneve, Add support for encap/decap flows offloadYevgeny Kliteynik2019-05-311-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add HW offloading support for flows with Geneve encap/decap. Notes about decap flows with Geneve TLV Options: - Support offloading of 32-bit options data only - At any given time, only one combination of class/type parameters can be offloaded, but the same class/type combination can have many different flows offloaded with different 32-bit option data - Options with value of 0 can't be offloaded Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | net/mlx5e: Rearrange tc tunnel code in a modular wayYevgeny Kliteynik2019-05-311-198/+83
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Rearrange tc tunnel code so that it would be easy to add future tunnels: - Define tc tunnel object with the fields and callbacks that any tunnel must implement. - Define tc UDP tunnel object for UDP tunnels, such as VXLAN - Move each tunnel code (GRE, VXLAN) to its own separate file - Rewrite tc tunnel implementation in a general way - using only the objects and their callbacks. Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | net/mlx5e: Geneve, Keep tunnel info as pointer to the original structYevgeny Kliteynik2019-05-311-7/+8
|/ | | | | | | | | | | | | | | | | In mlx5e encap entry structure, IP tunnel info data structure is copied by value. This approach worked till now, but it breaks when there are encapsulation options, such as in case of Geneve. These options are stored in the structure that is allocated adjacent to the IP tunnel info struct, and not pointed at by any field in that struct. Therefore, when copying the struct by value, we loose the address of the original struct and can't get to the encapsulation options. Fix the problem by storing the pointer to the tunnel info data instead. Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2019-04-171-0/+4
|\ | | | | | | | | | | Conflict resolution of af_smc.c from Stephen Rothwell. Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5e: Protect against non-uplink representor for encapDmytro Linkin2019-04-091-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | TC encap offload is supported only for the physical uplink representor. Fail for non uplink representor. Fixes: 3e621b19b0bb ("net/mlx5e: Support TC encapsulation offloads with upper devices") Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>