diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2021-04-29 11:57:23 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2021-04-29 11:57:23 -0700 |
commit | 9d31d2338950293ec19d9b095fbaa9030899dcb4 (patch) | |
tree | e688040d0557c24a2eeb9f6c9c223d949f6f7ef9 /drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | |
parent | 635de956a7f5a6ffcb04f29d70630c64c717b56b (diff) | |
parent | 4a52dd8fefb45626dace70a63c0738dbd83b7edb (diff) | |
download | linux-stable-9d31d2338950293ec19d9b095fbaa9030899dcb4.tar.gz linux-stable-9d31d2338950293ec19d9b095fbaa9030899dcb4.tar.bz2 linux-stable-9d31d2338950293ec19d9b095fbaa9030899dcb4.zip |
Merge tag 'net-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- bpf:
- allow bpf programs calling kernel functions (initially to
reuse TCP congestion control implementations)
- enable task local storage for tracing programs - remove the
need to store per-task state in hash maps, and allow tracing
programs access to task local storage previously added for
BPF_LSM
- add bpf_for_each_map_elem() helper, allowing programs to walk
all map elements in a more robust and easier to verify fashion
- sockmap: support UDP and cross-protocol BPF_SK_SKB_VERDICT
redirection
- lpm: add support for batched ops in LPM trie
- add BTF_KIND_FLOAT support - mostly to allow use of BTF on
s390 which has floats in its headers files
- improve BPF syscall documentation and extend the use of kdoc
parsing scripts we already employ for bpf-helpers
- libbpf, bpftool: support static linking of BPF ELF files
- improve support for encapsulation of L2 packets
- xdp: restructure redirect actions to avoid a runtime lookup,
improving performance by 4-8% in microbenchmarks
- xsk: build skb by page (aka generic zerocopy xmit) - improve
performance of software AF_XDP path by 33% for devices which don't
need headers in the linear skb part (e.g. virtio)
- nexthop: resilient next-hop groups - improve path stability on
next-hops group changes (incl. offload for mlxsw)
- ipv6: segment routing: add support for IPv4 decapsulation
- icmp: add support for RFC 8335 extended PROBE messages
- inet: use bigger hash table for IP ID generation
- tcp: deal better with delayed TX completions - make sure we don't
give up on fast TCP retransmissions only because driver is slow in
reporting that it completed transmitting the original
- tcp: reorder tcp_congestion_ops for better cache locality
- mptcp:
- add sockopt support for common TCP options
- add support for common TCP msg flags
- include multiple address ids in RM_ADDR
- add reset option support for resetting one subflow
- udp: GRO L4 improvements - improve 'forward' / 'frag_list'
co-existence with UDP tunnel GRO, allowing the first to take place
correctly even for encapsulated UDP traffic
- micro-optimize dev_gro_receive() and flow dissection, avoid
retpoline overhead on VLAN and TEB GRO
- use less memory for sysctls, add a new sysctl type, to allow using
u8 instead of "int" and "long" and shrink networking sysctls
- veth: allow GRO without XDP - this allows aggregating UDP packets
before handing them off to routing, bridge, OvS, etc.
- allow specifing ifindex when device is moved to another namespace
- netfilter:
- nft_socket: add support for cgroupsv2
- nftables: add catch-all set element - special element used to
define a default action in case normal lookup missed
- use net_generic infra in many modules to avoid allocating
per-ns memory unnecessarily
- xps: improve the xps handling to avoid potential out-of-bound
accesses and use-after-free when XPS change race with other
re-configuration under traffic
- add a config knob to turn off per-cpu netdev refcnt to catch
underflows in testing
Device APIs:
- add WWAN subsystem to organize the WWAN interfaces better and
hopefully start driving towards more unified and vendor-
independent APIs
- ethtool:
- add interface for reading IEEE MIB stats (incl. mlx5 and bnxt
support)
- allow network drivers to dump arbitrary SFP EEPROM data,
current offset+length API was a poor fit for modern SFP which
define EEPROM in terms of pages (incl. mlx5 support)
- act_police, flow_offload: add support for packet-per-second
policing (incl. offload for nfp)
- psample: add additional metadata attributes like transit delay for
packets sampled from switch HW (and corresponding egress and
policy-based sampling in the mlxsw driver)
- dsa: improve support for sandwiched LAGs with bridge and DSA
- netfilter:
- flowtable: use direct xmit in topologies with IP forwarding,
bridging, vlans etc.
- nftables: counter hardware offload support
- Bluetooth:
- improvements for firmware download w/ Intel devices
- add support for reading AOSP vendor capabilities
- add support for virtio transport driver
- mac80211:
- allow concurrent monitor iface and ethernet rx decap
- set priority and queue mapping for injected frames
- phy: add support for Clause-45 PHY Loopback
- pci/iov: add sysfs MSI-X vector assignment interface to distribute
MSI-X resources to VFs (incl. mlx5 support)
New hardware/drivers:
- dsa: mv88e6xxx: add support for Marvell mv88e6393x - 11-port
Ethernet switch with 8x 1-Gigabit Ethernet and 3x 10-Gigabit
interfaces.
- dsa: support for legacy Broadcom tags used on BCM5325, BCM5365 and
BCM63xx switches
- Microchip KSZ8863 and KSZ8873; 3x 10/100Mbps Ethernet switches
- ath11k: support for QCN9074 a 802.11ax device
- Bluetooth: Broadcom BCM4330 and BMC4334
- phy: Marvell 88X2222 transceiver support
- mdio: add BCM6368 MDIO mux bus controller
- r8152: support RTL8153 and RTL8156 (USB Ethernet) chips
- mana: driver for Microsoft Azure Network Adapter (MANA)
- Actions Semi Owl Ethernet MAC
- can: driver for ETAS ES58X CAN/USB interfaces
Pure driver changes:
- add XDP support to: enetc, igc, stmmac
- add AF_XDP support to: stmmac
- virtio:
- page_to_skb() use build_skb when there's sufficient tailroom
(21% improvement for 1000B UDP frames)
- support XDP even without dedicated Tx queues - share the Tx
queues with the stack when necessary
- mlx5:
- flow rules: add support for mirroring with conntrack, matching
on ICMP, GTP, flex filters and more
- support packet sampling with flow offloads
- persist uplink representor netdev across eswitch mode changes
- allow coexistence of CQE compression and HW time-stamping
- add ethtool extended link error state reporting
- ice, iavf: support flow filters, UDP Segmentation Offload
- dpaa2-switch:
- move the driver out of staging
- add spanning tree (STP) support
- add rx copybreak support
- add tc flower hardware offload on ingress traffic
- ionic:
- implement Rx page reuse
- support HW PTP time-stamping
- octeon: support TC hardware offloads - flower matching on ingress
and egress ratelimitting.
- stmmac:
- add RX frame steering based on VLAN priority in tc flower
- support frame preemption (FPE)
- intel: add cross time-stamping freq difference adjustment
- ocelot:
- support forwarding of MRP frames in HW
- support multiple bridges
- support PTP Sync one-step timestamping
- dsa: mv88e6xxx, dpaa2-switch: offload bridge port flags like
learning, flooding etc.
- ipa: add IPA v4.5, v4.9 and v4.11 support (Qualcomm SDX55, SM8350,
SC7280 SoCs)
- mt7601u: enable TDLS support
- mt76:
- add support for 802.3 rx frames (mt7915/mt7615)
- mt7915 flash pre-calibration support
- mt7921/mt7663 runtime power management fixes"
* tag 'net-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2451 commits)
net: selftest: fix build issue if INET is disabled
net: netrom: nr_in: Remove redundant assignment to ns
net: tun: Remove redundant assignment to ret
net: phy: marvell: add downshift support for M88E1240
net: dsa: ksz: Make reg_mib_cnt a u8 as it never exceeds 255
net/sched: act_ct: Remove redundant ct get and check
icmp: standardize naming of RFC 8335 PROBE constants
bpf, selftests: Update array map tests for per-cpu batched ops
bpf: Add batched ops support for percpu array
bpf: Implement formatted output helpers with bstr_printf
seq_file: Add a seq_bprintf function
sfc: adjust efx->xdp_tx_queue_count with the real number of initialized queues
net:nfc:digital: Fix a double free in digital_tg_recv_dep_req
net: fix a concurrency bug in l2tp_tunnel_register()
net/smc: Remove redundant assignment to rc
mpls: Remove redundant assignment to err
llc2: Remove redundant assignment to rc
net/tls: Remove redundant initialization of record
rds: Remove redundant assignment to nr_sig
dt-bindings: net: mdio-gpio: add compatible for microchip,mdio-smi0
...
Diffstat (limited to 'drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c')
-rw-r--r-- | drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 724 |
1 files changed, 394 insertions, 330 deletions
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c index d4a2f8d1ee9f..db1e74280e57 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c @@ -40,7 +40,6 @@ #include "eswitch.h" #include "esw/indir_table.h" #include "esw/acl/ofld.h" -#include "esw/indir_table.h" #include "rdma.h" #include "en.h" #include "fs_core.h" @@ -48,6 +47,17 @@ #include "lib/eq.h" #include "lib/fs_chains.h" #include "en_tc.h" +#include "en/mapping.h" + +#define mlx5_esw_for_each_rep(esw, i, rep) \ + xa_for_each(&((esw)->offloads.vport_reps), i, rep) + +#define mlx5_esw_for_each_sf_rep(esw, i, rep) \ + xa_for_each_marked(&((esw)->offloads.vport_reps), i, rep, MLX5_ESW_VPT_SF) + +#define mlx5_esw_for_each_vf_rep(esw, index, rep) \ + mlx5_esw_for_each_entry_marked(&((esw)->offloads.vport_reps), index, \ + rep, (esw)->esw_funcs.num_vfs, MLX5_ESW_VPT_VF) /* There are two match-all miss flows, one for unicast dst mac and * one for multicast. @@ -55,192 +65,19 @@ #define MLX5_ESW_MISS_FLOWS (2) #define UPLINK_REP_INDEX 0 -/* Per vport tables */ - -#define MLX5_ESW_VPORT_TABLE_SIZE 128 - -/* This struct is used as a key to the hash table and we need it to be packed - * so hash result is consistent - */ -struct mlx5_vport_key { - u32 chain; - u16 prio; - u16 vport; - u16 vhca_id; -} __packed; - -struct mlx5_vport_tbl_attr { - u16 chain; - u16 prio; - u16 vport; -}; - -struct mlx5_vport_table { - struct hlist_node hlist; - struct mlx5_flow_table *fdb; - u32 num_rules; - struct mlx5_vport_key key; -}; - +#define MLX5_ESW_VPORT_TBL_SIZE 128 #define MLX5_ESW_VPORT_TBL_NUM_GROUPS 4 -static struct mlx5_flow_table * -esw_vport_tbl_create(struct mlx5_eswitch *esw, struct mlx5_flow_namespace *ns) -{ - struct mlx5_flow_table_attr ft_attr = {}; - struct mlx5_flow_table *fdb; - - ft_attr.autogroup.max_num_groups = MLX5_ESW_VPORT_TBL_NUM_GROUPS; - ft_attr.max_fte = MLX5_ESW_VPORT_TABLE_SIZE; - ft_attr.prio = FDB_PER_VPORT; - fdb = mlx5_create_auto_grouped_flow_table(ns, &ft_attr); - if (IS_ERR(fdb)) { - esw_warn(esw->dev, "Failed to create per vport FDB Table err %ld\n", - PTR_ERR(fdb)); - } - - return fdb; -} - -static u32 flow_attr_to_vport_key(struct mlx5_eswitch *esw, - struct mlx5_vport_tbl_attr *attr, - struct mlx5_vport_key *key) -{ - key->vport = attr->vport; - key->chain = attr->chain; - key->prio = attr->prio; - key->vhca_id = MLX5_CAP_GEN(esw->dev, vhca_id); - return jhash(key, sizeof(*key), 0); -} - -/* caller must hold vports.lock */ -static struct mlx5_vport_table * -esw_vport_tbl_lookup(struct mlx5_eswitch *esw, struct mlx5_vport_key *skey, u32 key) -{ - struct mlx5_vport_table *e; - - hash_for_each_possible(esw->fdb_table.offloads.vports.table, e, hlist, key) - if (!memcmp(&e->key, skey, sizeof(*skey))) - return e; - - return NULL; -} - -static void -esw_vport_tbl_put(struct mlx5_eswitch *esw, struct mlx5_vport_tbl_attr *attr) -{ - struct mlx5_vport_table *e; - struct mlx5_vport_key key; - u32 hkey; - - mutex_lock(&esw->fdb_table.offloads.vports.lock); - hkey = flow_attr_to_vport_key(esw, attr, &key); - e = esw_vport_tbl_lookup(esw, &key, hkey); - if (!e || --e->num_rules) - goto out; - - hash_del(&e->hlist); - mlx5_destroy_flow_table(e->fdb); - kfree(e); -out: - mutex_unlock(&esw->fdb_table.offloads.vports.lock); -} - -static struct mlx5_flow_table * -esw_vport_tbl_get(struct mlx5_eswitch *esw, struct mlx5_vport_tbl_attr *attr) -{ - struct mlx5_core_dev *dev = esw->dev; - struct mlx5_flow_namespace *ns; - struct mlx5_flow_table *fdb; - struct mlx5_vport_table *e; - struct mlx5_vport_key skey; - u32 hkey; - - mutex_lock(&esw->fdb_table.offloads.vports.lock); - hkey = flow_attr_to_vport_key(esw, attr, &skey); - e = esw_vport_tbl_lookup(esw, &skey, hkey); - if (e) { - e->num_rules++; - goto out; - } - - e = kzalloc(sizeof(*e), GFP_KERNEL); - if (!e) { - fdb = ERR_PTR(-ENOMEM); - goto err_alloc; - } - - ns = mlx5_get_flow_namespace(dev, MLX5_FLOW_NAMESPACE_FDB); - if (!ns) { - esw_warn(dev, "Failed to get FDB namespace\n"); - fdb = ERR_PTR(-ENOENT); - goto err_ns; - } - - fdb = esw_vport_tbl_create(esw, ns); - if (IS_ERR(fdb)) - goto err_ns; - - e->fdb = fdb; - e->num_rules = 1; - e->key = skey; - hash_add(esw->fdb_table.offloads.vports.table, &e->hlist, hkey); -out: - mutex_unlock(&esw->fdb_table.offloads.vports.lock); - return e->fdb; - -err_ns: - kfree(e); -err_alloc: - mutex_unlock(&esw->fdb_table.offloads.vports.lock); - return fdb; -} - -int mlx5_esw_vport_tbl_get(struct mlx5_eswitch *esw) -{ - struct mlx5_vport_tbl_attr attr; - struct mlx5_flow_table *fdb; - struct mlx5_vport *vport; - int i; - - attr.chain = 0; - attr.prio = 1; - mlx5_esw_for_all_vports(esw, i, vport) { - attr.vport = vport->vport; - fdb = esw_vport_tbl_get(esw, &attr); - if (IS_ERR(fdb)) - goto out; - } - return 0; - -out: - mlx5_esw_vport_tbl_put(esw); - return PTR_ERR(fdb); -} - -void mlx5_esw_vport_tbl_put(struct mlx5_eswitch *esw) -{ - struct mlx5_vport_tbl_attr attr; - struct mlx5_vport *vport; - int i; - - attr.chain = 0; - attr.prio = 1; - mlx5_esw_for_all_vports(esw, i, vport) { - attr.vport = vport->vport; - esw_vport_tbl_put(esw, &attr); - } -} - -/* End: Per vport tables */ +static const struct esw_vport_tbl_namespace mlx5_esw_vport_tbl_mirror_ns = { + .max_fte = MLX5_ESW_VPORT_TBL_SIZE, + .max_num_groups = MLX5_ESW_VPORT_TBL_NUM_GROUPS, + .flags = 0, +}; static struct mlx5_eswitch_rep *mlx5_eswitch_get_rep(struct mlx5_eswitch *esw, u16 vport_num) { - int idx = mlx5_eswitch_vport_num_to_index(esw, vport_num); - - WARN_ON(idx > esw->total_vports - 1); - return &esw->offloads.vport_reps[idx]; + return xa_load(&esw->offloads.vport_reps, vport_num); } static void @@ -256,6 +93,26 @@ mlx5_eswitch_set_rule_flow_source(struct mlx5_eswitch *esw, MLX5_FLOW_CONTEXT_FLOW_SOURCE_LOCAL_VPORT; } +/* Actually only the upper 16 bits of reg c0 need to be cleared, but the lower 16 bits + * are not needed as well in the following process. So clear them all for simplicity. + */ +void +mlx5_eswitch_clear_rule_source_port(struct mlx5_eswitch *esw, struct mlx5_flow_spec *spec) +{ + if (mlx5_eswitch_vport_match_metadata_enabled(esw)) { + void *misc2; + + misc2 = MLX5_ADDR_OF(fte_match_param, spec->match_value, misc_parameters_2); + MLX5_SET(fte_match_set_misc2, misc2, metadata_reg_c_0, 0); + + misc2 = MLX5_ADDR_OF(fte_match_param, spec->match_criteria, misc_parameters_2); + MLX5_SET(fte_match_set_misc2, misc2, metadata_reg_c_0, 0); + + if (!memchr_inv(misc2, 0, MLX5_ST_SZ_BYTES(fte_match_set_misc2))) + spec->match_criteria_enable &= ~MLX5_MATCH_MISC_PARAMETERS_2; + } +} + static void mlx5_eswitch_set_rule_source_port(struct mlx5_eswitch *esw, struct mlx5_flow_spec *spec, @@ -327,6 +184,19 @@ esw_cleanup_decap_indir(struct mlx5_eswitch *esw, } static int +esw_setup_sampler_dest(struct mlx5_flow_destination *dest, + struct mlx5_flow_act *flow_act, + struct mlx5_esw_flow_attr *esw_attr, + int i) +{ + flow_act->flags |= FLOW_ACT_IGNORE_FLOW_LEVEL; + dest[i].type = MLX5_FLOW_DESTINATION_TYPE_FLOW_SAMPLER; + dest[i].sampler_id = esw_attr->sample->sampler_id; + + return 0; +} + +static int esw_setup_ft_dest(struct mlx5_flow_destination *dest, struct mlx5_flow_act *flow_act, struct mlx5_eswitch *esw, @@ -561,7 +431,10 @@ esw_setup_dests(struct mlx5_flow_destination *dest, esw_src_port_rewrite_supported(esw)) attr->flags |= MLX5_ESW_ATTR_FLAG_SRC_REWRITE; - if (attr->dest_ft) { + if (attr->flags & MLX5_ESW_ATTR_FLAG_SAMPLE) { + esw_setup_sampler_dest(dest, flow_act, esw_attr, *i); + (*i)++; + } else if (attr->dest_ft) { esw_setup_ft_dest(dest, flow_act, esw, attr, spec, *i); (*i)++; } else if (attr->flags & MLX5_ESW_ATTR_FLAG_SLOW_PATH) { @@ -664,12 +537,16 @@ mlx5_eswitch_add_offloaded_rule(struct mlx5_eswitch *esw, if (flow_act.action & MLX5_FLOW_CONTEXT_ACTION_MOD_HDR) flow_act.modify_hdr = attr->modify_hdr; - if (split) { + /* esw_attr->sample is allocated only when there is a sample action */ + if (esw_attr->sample && esw_attr->sample->sample_default_tbl) { + fdb = esw_attr->sample->sample_default_tbl; + } else if (split) { fwd_attr.chain = attr->chain; fwd_attr.prio = attr->prio; fwd_attr.vport = esw_attr->in_rep->vport; + fwd_attr.vport_ns = &mlx5_esw_vport_tbl_mirror_ns; - fdb = esw_vport_tbl_get(esw, &fwd_attr); + fdb = mlx5_esw_vporttbl_get(esw, &fwd_attr); } else { if (attr->chain || attr->prio) fdb = mlx5_chains_get_table(chains, attr->chain, @@ -701,7 +578,7 @@ mlx5_eswitch_add_offloaded_rule(struct mlx5_eswitch *esw, err_add_rule: if (split) - esw_vport_tbl_put(esw, &fwd_attr); + mlx5_esw_vporttbl_put(esw, &fwd_attr); else if (attr->chain || attr->prio) mlx5_chains_put_table(chains, attr->chain, attr->prio, 0); err_esw_get: @@ -734,7 +611,8 @@ mlx5_eswitch_add_fwd_rule(struct mlx5_eswitch *esw, fwd_attr.chain = attr->chain; fwd_attr.prio = attr->prio; fwd_attr.vport = esw_attr->in_rep->vport; - fwd_fdb = esw_vport_tbl_get(esw, &fwd_attr); + fwd_attr.vport_ns = &mlx5_esw_vport_tbl_mirror_ns; + fwd_fdb = mlx5_esw_vporttbl_get(esw, &fwd_attr); if (IS_ERR(fwd_fdb)) { rule = ERR_CAST(fwd_fdb); goto err_get_fwd; @@ -779,7 +657,7 @@ mlx5_eswitch_add_fwd_rule(struct mlx5_eswitch *esw, return rule; err_chain_src_rewrite: esw_put_dest_tables_loop(esw, attr, 0, i); - esw_vport_tbl_put(esw, &fwd_attr); + mlx5_esw_vporttbl_put(esw, &fwd_attr); err_get_fwd: mlx5_chains_put_table(chains, attr->chain, attr->prio, 0); err_get_fast: @@ -814,15 +692,16 @@ __mlx5_eswitch_del_rule(struct mlx5_eswitch *esw, fwd_attr.chain = attr->chain; fwd_attr.prio = attr->prio; fwd_attr.vport = esw_attr->in_rep->vport; + fwd_attr.vport_ns = &mlx5_esw_vport_tbl_mirror_ns; } if (fwd_rule) { - esw_vport_tbl_put(esw, &fwd_attr); + mlx5_esw_vporttbl_put(esw, &fwd_attr); mlx5_chains_put_table(chains, attr->chain, attr->prio, 0); esw_put_dest_tables_loop(esw, attr, 0, esw_attr->split_count); } else { if (split) - esw_vport_tbl_put(esw, &fwd_attr); + mlx5_esw_vporttbl_put(esw, &fwd_attr); else if (attr->chain || attr->prio) mlx5_chains_put_table(chains, attr->chain, attr->prio, 0); esw_cleanup_dests(esw, attr); @@ -848,10 +727,11 @@ mlx5_eswitch_del_fwd_rule(struct mlx5_eswitch *esw, static int esw_set_global_vlan_pop(struct mlx5_eswitch *esw, u8 val) { struct mlx5_eswitch_rep *rep; - int i, err = 0; + unsigned long i; + int err = 0; esw_debug(esw->dev, "%s applying global %s policy\n", __func__, val ? "pop" : "none"); - mlx5_esw_for_each_host_func_rep(esw, i, rep, esw->esw_funcs.num_vfs) { + mlx5_esw_for_each_host_func_vport(esw, i, rep, esw->esw_funcs.num_vfs) { if (atomic_read(&rep->rep_data[REP_ETH].state) != REP_LOADED) continue; @@ -1043,7 +923,8 @@ out: } struct mlx5_flow_handle * -mlx5_eswitch_add_send_to_vport_rule(struct mlx5_eswitch *esw, u16 vport, +mlx5_eswitch_add_send_to_vport_rule(struct mlx5_eswitch *on_esw, + struct mlx5_eswitch_rep *rep, u32 sqn) { struct mlx5_flow_act flow_act = {0}; @@ -1061,21 +942,30 @@ mlx5_eswitch_add_send_to_vport_rule(struct mlx5_eswitch *esw, u16 vport, misc = MLX5_ADDR_OF(fte_match_param, spec->match_value, misc_parameters); MLX5_SET(fte_match_set_misc, misc, source_sqn, sqn); /* source vport is the esw manager */ - MLX5_SET(fte_match_set_misc, misc, source_port, esw->manager_vport); + MLX5_SET(fte_match_set_misc, misc, source_port, rep->esw->manager_vport); + if (MLX5_CAP_ESW(on_esw->dev, merged_eswitch)) + MLX5_SET(fte_match_set_misc, misc, source_eswitch_owner_vhca_id, + MLX5_CAP_GEN(rep->esw->dev, vhca_id)); misc = MLX5_ADDR_OF(fte_match_param, spec->match_criteria, misc_parameters); MLX5_SET_TO_ONES(fte_match_set_misc, misc, source_sqn); MLX5_SET_TO_ONES(fte_match_set_misc, misc, source_port); + if (MLX5_CAP_ESW(on_esw->dev, merged_eswitch)) + MLX5_SET_TO_ONES(fte_match_set_misc, misc, + source_eswitch_owner_vhca_id); spec->match_criteria_enable = MLX5_MATCH_MISC_PARAMETERS; dest.type = MLX5_FLOW_DESTINATION_TYPE_VPORT; - dest.vport.num = vport; + dest.vport.num = rep->vport; + dest.vport.vhca_id = MLX5_CAP_GEN(rep->esw->dev, vhca_id); + dest.vport.flags |= MLX5_FLOW_DEST_VPORT_VHCA_ID; flow_act.action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST; - flow_rule = mlx5_add_flow_rules(esw->fdb_table.offloads.slow_fdb, + flow_rule = mlx5_add_flow_rules(on_esw->fdb_table.offloads.slow_fdb, spec, &flow_act, &dest, 1); if (IS_ERR(flow_rule)) - esw_warn(esw->dev, "FDB: Failed to add send to vport rule err %ld\n", PTR_ERR(flow_rule)); + esw_warn(on_esw->dev, "FDB: Failed to add send to vport rule err %ld\n", + PTR_ERR(flow_rule)); out: kvfree(spec); return flow_rule; @@ -1090,13 +980,13 @@ void mlx5_eswitch_del_send_to_vport_rule(struct mlx5_flow_handle *rule) static void mlx5_eswitch_del_send_to_vport_meta_rules(struct mlx5_eswitch *esw) { struct mlx5_flow_handle **flows = esw->fdb_table.offloads.send_to_vport_meta_rules; - int i = 0, num_vfs = esw->esw_funcs.num_vfs, vport_num; + int i = 0, num_vfs = esw->esw_funcs.num_vfs; if (!num_vfs || !flows) return; - mlx5_esw_for_each_vf_vport_num(esw, vport_num, num_vfs) - mlx5_del_flow_rules(flows[i++]); + for (i = 0; i < num_vfs; i++) + mlx5_del_flow_rules(flows[i]); kvfree(flows); } @@ -1104,12 +994,15 @@ static void mlx5_eswitch_del_send_to_vport_meta_rules(struct mlx5_eswitch *esw) static int mlx5_eswitch_add_send_to_vport_meta_rules(struct mlx5_eswitch *esw) { - int num_vfs, vport_num, rule_idx = 0, err = 0; struct mlx5_flow_destination dest = {}; struct mlx5_flow_act flow_act = {0}; + int num_vfs, rule_idx = 0, err = 0; struct mlx5_flow_handle *flow_rule; struct mlx5_flow_handle **flows; struct mlx5_flow_spec *spec; + struct mlx5_vport *vport; + unsigned long i; + u16 vport_num; num_vfs = esw->esw_funcs.num_vfs; flows = kvzalloc(num_vfs * sizeof(*flows), GFP_KERNEL); @@ -1133,7 +1026,8 @@ mlx5_eswitch_add_send_to_vport_meta_rules(struct mlx5_eswitch *esw) dest.type = MLX5_FLOW_DESTINATION_TYPE_VPORT; flow_act.action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST; - mlx5_esw_for_each_vf_vport_num(esw, vport_num, num_vfs) { + mlx5_esw_for_each_vf_vport(esw, i, vport, num_vfs) { + vport_num = vport->vport; MLX5_SET(fte_match_param, spec->match_value, misc_parameters_2.metadata_reg_c_0, mlx5_eswitch_get_vport_metadata_for_match(esw, vport_num)); dest.vport.num = vport_num; @@ -1275,12 +1169,14 @@ static int esw_add_fdb_peer_miss_rules(struct mlx5_eswitch *esw, struct mlx5_flow_destination dest = {}; struct mlx5_flow_act flow_act = {0}; struct mlx5_flow_handle **flows; - struct mlx5_flow_handle *flow; - struct mlx5_flow_spec *spec; /* total vports is the same for both e-switches */ int nvports = esw->total_vports; + struct mlx5_flow_handle *flow; + struct mlx5_flow_spec *spec; + struct mlx5_vport *vport; + unsigned long i; void *misc; - int err, i; + int err; spec = kvzalloc(sizeof(*spec), GFP_KERNEL); if (!spec) @@ -1299,6 +1195,7 @@ static int esw_add_fdb_peer_miss_rules(struct mlx5_eswitch *esw, misc_parameters); if (mlx5_core_is_ecpf_esw_manager(esw->dev)) { + vport = mlx5_eswitch_get_vport(esw, MLX5_VPORT_PF); esw_set_peer_miss_rule_source_port(esw, peer_dev->priv.eswitch, spec, MLX5_VPORT_PF); @@ -1308,10 +1205,11 @@ static int esw_add_fdb_peer_miss_rules(struct mlx5_eswitch *esw, err = PTR_ERR(flow); goto add_pf_flow_err; } - flows[MLX5_VPORT_PF] = flow; + flows[vport->index] = flow; } if (mlx5_ecpf_vport_exists(esw->dev)) { + vport = mlx5_eswitch_get_vport(esw, MLX5_VPORT_ECPF); MLX5_SET(fte_match_set_misc, misc, source_port, MLX5_VPORT_ECPF); flow = mlx5_add_flow_rules(esw->fdb_table.offloads.slow_fdb, spec, &flow_act, &dest, 1); @@ -1319,13 +1217,13 @@ static int esw_add_fdb_peer_miss_rules(struct mlx5_eswitch *esw, err = PTR_ERR(flow); goto add_ecpf_flow_err; } - flows[mlx5_eswitch_ecpf_idx(esw)] = flow; + flows[vport->index] = flow; } - mlx5_esw_for_each_vf_vport_num(esw, i, mlx5_core_max_vfs(esw->dev)) { + mlx5_esw_for_each_vf_vport(esw, i, vport, mlx5_core_max_vfs(esw->dev)) { esw_set_peer_miss_rule_source_port(esw, peer_dev->priv.eswitch, - spec, i); + spec, vport->vport); flow = mlx5_add_flow_rules(esw->fdb_table.offloads.slow_fdb, spec, &flow_act, &dest, 1); @@ -1333,7 +1231,7 @@ static int esw_add_fdb_peer_miss_rules(struct mlx5_eswitch *esw, err = PTR_ERR(flow); goto add_vf_flow_err; } - flows[i] = flow; + flows[vport->index] = flow; } esw->fdb_table.offloads.peer_miss_rules = flows; @@ -1342,15 +1240,20 @@ static int esw_add_fdb_peer_miss_rules(struct mlx5_eswitch *esw, return 0; add_vf_flow_err: - nvports = --i; - mlx5_esw_for_each_vf_vport_num_reverse(esw, i, nvports) - mlx5_del_flow_rules(flows[i]); - - if (mlx5_ecpf_vport_exists(esw->dev)) - mlx5_del_flow_rules(flows[mlx5_eswitch_ecpf_idx(esw)]); + mlx5_esw_for_each_vf_vport(esw, i, vport, mlx5_core_max_vfs(esw->dev)) { + if (!flows[vport->index]) + continue; + mlx5_del_flow_rules(flows[vport->index]); + } + if (mlx5_ecpf_vport_exists(esw->dev)) { + vport = mlx5_eswitch_get_vport(esw, MLX5_VPORT_ECPF); + mlx5_del_flow_rules(flows[vport->index]); + } add_ecpf_flow_err: - if (mlx5_core_is_ecpf_esw_manager(esw->dev)) - mlx5_del_flow_rules(flows[MLX5_VPORT_PF]); + if (mlx5_core_is_ecpf_esw_manager(esw->dev)) { + vport = mlx5_eswitch_get_vport(esw, MLX5_VPORT_PF); + mlx5_del_flow_rules(flows[vport->index]); + } add_pf_flow_err: esw_warn(esw->dev, "FDB: Failed to add peer miss flow rule err %d\n", err); kvfree(flows); @@ -1362,20 +1265,23 @@ alloc_flows_err: static void esw_del_fdb_peer_miss_rules(struct mlx5_eswitch *esw) { struct mlx5_flow_handle **flows; - int i; + struct mlx5_vport *vport; + unsigned long i; flows = esw->fdb_table.offloads.peer_miss_rules; - mlx5_esw_for_each_vf_vport_num_reverse(esw, i, - mlx5_core_max_vfs(esw->dev)) - mlx5_del_flow_rules(flows[i]); + mlx5_esw_for_each_vf_vport(esw, i, vport, mlx5_core_max_vfs(esw->dev)) + mlx5_del_flow_rules(flows[vport->index]); - if (mlx5_ecpf_vport_exists(esw->dev)) - mlx5_del_flow_rules(flows[mlx5_eswitch_ecpf_idx(esw)]); - - if (mlx5_core_is_ecpf_esw_manager(esw->dev)) - mlx5_del_flow_rules(flows[MLX5_VPORT_PF]); + if (mlx5_ecpf_vport_exists(esw->dev)) { + vport = mlx5_eswitch_get_vport(esw, MLX5_VPORT_ECPF); + mlx5_del_flow_rules(flows[vport->index]); + } + if (mlx5_core_is_ecpf_esw_manager(esw->dev)) { + vport = mlx5_eswitch_get_vport(esw, MLX5_VPORT_PF); + mlx5_del_flow_rules(flows[vport->index]); + } kvfree(flows); } @@ -1453,14 +1359,14 @@ esw_add_restore_rule(struct mlx5_eswitch *esw, u32 tag) if (!mlx5_eswitch_reg_c1_loopback_supported(esw)) return ERR_PTR(-EOPNOTSUPP); - spec = kzalloc(sizeof(*spec), GFP_KERNEL); + spec = kvzalloc(sizeof(*spec), GFP_KERNEL); if (!spec) return ERR_PTR(-ENOMEM); misc = MLX5_ADDR_OF(fte_match_param, spec->match_criteria, misc_parameters_2); MLX5_SET(fte_match_set_misc2, misc, metadata_reg_c_0, - ESW_CHAIN_TAG_METADATA_MASK); + ESW_REG_C0_USER_DATA_METADATA_MASK); misc = MLX5_ADDR_OF(fte_match_param, spec->match_value, misc_parameters_2); MLX5_SET(fte_match_set_misc2, misc, metadata_reg_c_0, tag); @@ -1476,7 +1382,7 @@ esw_add_restore_rule(struct mlx5_eswitch *esw, u32 tag) dest.ft = esw->offloads.ft_offloads; flow_rule = mlx5_add_flow_rules(ft, spec, &flow_act, &dest, 1); - kfree(spec); + kvfree(spec); if (IS_ERR(flow_rule)) esw_warn(esw->dev, @@ -1486,12 +1392,6 @@ esw_add_restore_rule(struct mlx5_eswitch *esw, u32 tag) return flow_rule; } -u32 -esw_get_max_restore_tag(struct mlx5_eswitch *esw) -{ - return ESW_CHAIN_TAG_METADATA_MASK; -} - #define MAX_PF_SQ 256 #define MAX_SQ_NVPORTS 32 @@ -1521,6 +1421,44 @@ static void esw_set_flow_group_source_port(struct mlx5_eswitch *esw, } #if IS_ENABLED(CONFIG_MLX5_CLS_ACT) +static void esw_vport_tbl_put(struct mlx5_eswitch *esw) +{ + struct mlx5_vport_tbl_attr attr; + struct mlx5_vport *vport; + unsigned long i; + + attr.chain = 0; + attr.prio = 1; + mlx5_esw_for_each_vport(esw, i, vport) { + attr.vport = vport->vport; + attr.vport_ns = &mlx5_esw_vport_tbl_mirror_ns; + mlx5_esw_vporttbl_put(esw, &attr); + } +} + +static int esw_vport_tbl_get(struct mlx5_eswitch *esw) +{ + struct mlx5_vport_tbl_attr attr; + struct mlx5_flow_table *fdb; + struct mlx5_vport *vport; + unsigned long i; + + attr.chain = 0; + attr.prio = 1; + mlx5_esw_for_each_vport(esw, i, vport) { + attr.vport = vport->vport; + attr.vport_ns = &mlx5_esw_vport_tbl_mirror_ns; + fdb = mlx5_esw_vporttbl_get(esw, &attr); + if (IS_ERR(fdb)) + goto out; + } + return 0; + +out: + esw_vport_tbl_put(esw); + return PTR_ERR(fdb); +} + #define fdb_modify_header_fwd_to_table_supported(esw) \ (MLX5_CAP_ESW_FLOWTABLE((esw)->dev, fdb_modify_header_fwd_to_table)) static void esw_init_chains_offload_flags(struct mlx5_eswitch *esw, u32 *flags) @@ -1570,7 +1508,7 @@ esw_chains_create(struct mlx5_eswitch *esw, struct mlx5_flow_table *miss_fdb) attr.max_ft_sz = fdb_max; attr.max_grp_num = esw->params.large_group_num; attr.default_ft = miss_fdb; - attr.max_restore_tag = esw_get_max_restore_tag(esw); + attr.mapping = esw->offloads.reg_c0_obj_pool; chains = mlx5_chains_create(dev, &attr); if (IS_ERR(chains)) { @@ -1598,7 +1536,7 @@ esw_chains_create(struct mlx5_eswitch *esw, struct mlx5_flow_table *miss_fdb) /* Open level 1 for split fdb rules now if prios isn't supported */ if (!mlx5_chains_prios_supported(chains)) { - err = mlx5_esw_vport_tbl_get(esw); + err = esw_vport_tbl_get(esw); if (err) goto level_1_err; } @@ -1622,7 +1560,7 @@ static void esw_chains_destroy(struct mlx5_eswitch *esw, struct mlx5_fs_chains *chains) { if (!mlx5_chains_prios_supported(chains)) - mlx5_esw_vport_tbl_put(esw); + esw_vport_tbl_put(esw); mlx5_chains_put_table(chains, 0, 1, 0); mlx5_chains_put_table(chains, mlx5_chains_get_nf_ft_chain(chains), 1, 0); mlx5_chains_destroy(chains); @@ -1709,6 +1647,12 @@ static int esw_create_offloads_fdb_tables(struct mlx5_eswitch *esw) MLX5_SET_TO_ONES(fte_match_param, match_criteria, misc_parameters.source_sqn); MLX5_SET_TO_ONES(fte_match_param, match_criteria, misc_parameters.source_port); + if (MLX5_CAP_ESW(esw->dev, merged_eswitch)) { + MLX5_SET_TO_ONES(fte_match_param, match_criteria, + misc_parameters.source_eswitch_owner_vhca_id); + MLX5_SET(create_flow_group_in, flow_group_in, + source_eswitch_owner_vhca_id_valid, 1); + } ix = esw->total_vports * MAX_SQ_NVPORTS + MAX_PF_SQ; MLX5_SET(create_flow_group_in, flow_group_in, start_flow_index, 0); @@ -1865,6 +1809,7 @@ static void esw_destroy_offloads_fdb_tables(struct mlx5_eswitch *esw) /* Holds true only as long as DMFS is the default */ mlx5_flow_namespace_set_mode(esw->fdb_table.offloads.ns, MLX5_FLOW_STEERING_MODE_DMFS); + atomic64_set(&esw->user_count, 0); } static int esw_create_offloads_table(struct mlx5_eswitch *esw) @@ -1988,12 +1933,12 @@ out: return flow_rule; } - -static int mlx5_eswitch_inline_mode_get(const struct mlx5_eswitch *esw, u8 *mode) +static int mlx5_eswitch_inline_mode_get(struct mlx5_eswitch *esw, u8 *mode) { u8 prev_mlx5_mode, mlx5_mode = MLX5_INLINE_MODE_L2; struct mlx5_core_dev *dev = esw->dev; - int vport; + struct mlx5_vport *vport; + unsigned long i; if (!MLX5_CAP_GEN(dev, vport_group_manager)) return -EOPNOTSUPP; @@ -2014,8 +1959,8 @@ static int mlx5_eswitch_inline_mode_get(const struct mlx5_eswitch *esw, u8 *mode query_vports: mlx5_query_nic_vport_min_inline(dev, esw->first_host_vport, &prev_mlx5_mode); - mlx5_esw_for_each_host_func_vport(esw, vport, esw->esw_funcs.num_vfs) { - mlx5_query_nic_vport_min_inline(dev, vport, &mlx5_mode); + mlx5_esw_for_each_host_func_vport(esw, i, vport, esw->esw_funcs.num_vfs) { + mlx5_query_nic_vport_min_inline(dev, vport->vport, &mlx5_mode); if (prev_mlx5_mode != mlx5_mode) return -EINVAL; prev_mlx5_mode = mlx5_mode; @@ -2067,7 +2012,7 @@ static int esw_create_restore_table(struct mlx5_eswitch *esw) goto out_free; } - ft_attr.max_fte = 1 << ESW_CHAIN_TAG_METADATA_BITS; + ft_attr.max_fte = 1 << ESW_REG_C0_USER_DATA_METADATA_BITS; ft = mlx5_create_flow_table(ns, &ft_attr); if (IS_ERR(ft)) { err = PTR_ERR(ft); @@ -2082,7 +2027,7 @@ static int esw_create_restore_table(struct mlx5_eswitch *esw) misc_parameters_2); MLX5_SET(fte_match_set_misc2, misc, metadata_reg_c_0, - ESW_CHAIN_TAG_METADATA_MASK); + ESW_REG_C0_USER_DATA_METADATA_MASK); MLX5_SET(create_flow_group_in, flow_group_in, start_flow_index, 0); MLX5_SET(create_flow_group_in, flow_group_in, end_flow_index, ft_attr.max_fte - 1); @@ -2158,34 +2103,82 @@ static int esw_offloads_start(struct mlx5_eswitch *esw, return err; } -void esw_offloads_cleanup_reps(struct mlx5_eswitch *esw) +static void mlx5_esw_offloads_rep_mark_set(struct mlx5_eswitch *esw, + struct mlx5_eswitch_rep *rep, + xa_mark_t mark) { - kfree(esw->offloads.vport_reps); + bool mark_set; + + /* Copy the mark from vport to its rep */ + mark_set = xa_get_mark(&esw->vports, rep->vport, mark); + if (mark_set) + xa_set_mark(&esw->offloads.vport_reps, rep->vport, mark); } -int esw_offloads_init_reps(struct mlx5_eswitch *esw) +static int mlx5_esw_offloads_rep_init(struct mlx5_eswitch *esw, const struct mlx5_vport *vport) { - int total_vports = esw->total_vports; struct mlx5_eswitch_rep *rep; - int vport_index; - u8 rep_type; + int rep_type; + int err; - esw->offloads.vport_reps = kcalloc(total_vports, - sizeof(struct mlx5_eswitch_rep), - GFP_KERNEL); - if (!esw->offloads.vport_reps) + rep = kzalloc(sizeof(*rep), GFP_KERNEL); + if (!rep) return -ENOMEM; - mlx5_esw_for_all_reps(esw, vport_index, rep) { - rep->vport = mlx5_eswitch_index_to_vport_num(esw, vport_index); - rep->vport_index = vport_index; + rep->vport = vport->vport; + rep->vport_index = vport->index; + for (rep_type = 0; rep_type < NUM_REP_TYPES; rep_type++) + atomic_set(&rep->rep_data[rep_type].state, REP_UNREGISTERED); - for (rep_type = 0; rep_type < NUM_REP_TYPES; rep_type++) - atomic_set(&rep->rep_data[rep_type].state, - REP_UNREGISTERED); - } + err = xa_insert(&esw->offloads.vport_reps, rep->vport, rep, GFP_KERNEL); + if (err) + goto insert_err; + mlx5_esw_offloads_rep_mark_set(esw, rep, MLX5_ESW_VPT_HOST_FN); + mlx5_esw_offloads_rep_mark_set(esw, rep, MLX5_ESW_VPT_VF); + mlx5_esw_offloads_rep_mark_set(esw, rep, MLX5_ESW_VPT_SF); return 0; + +insert_err: + kfree(rep); + return err; +} + +static void mlx5_esw_offloads_rep_cleanup(struct mlx5_eswitch *esw, + struct mlx5_eswitch_rep *rep) +{ + xa_erase(&esw->offloads.vport_reps, rep->vport); + kfree(rep); +} + +void esw_offloads_cleanup_reps(struct mlx5_eswitch *esw) +{ + struct mlx5_eswitch_rep *rep; + unsigned long i; + + mlx5_esw_for_each_rep(esw, i, rep) + mlx5_esw_offloads_rep_cleanup(esw, rep); + xa_destroy(&esw->offloads.vport_reps); +} + +int esw_offloads_init_reps(struct mlx5_eswitch *esw) +{ + struct mlx5_vport *vport; + unsigned long i; + int err; + + xa_init(&esw->offloads.vport_reps); + + mlx5_esw_for_each_vport(esw, i, vport) { + err = mlx5_esw_offloads_rep_init(esw, vport); + if (err) + goto err; + } + return 0; + +err: + esw_offloads_cleanup_reps(esw); + return err; } static void __esw_offloads_unload_rep(struct mlx5_eswitch *esw, @@ -2199,7 +2192,7 @@ static void __esw_offloads_unload_rep(struct mlx5_eswitch *esw, static void __unload_reps_sf_vport(struct mlx5_eswitch *esw, u8 rep_type) { struct mlx5_eswitch_rep *rep; - int i; + unsigned long i; mlx5_esw_for_each_sf_rep(esw, i, rep) __esw_offloads_unload_rep(esw, rep, rep_type); @@ -2208,11 +2201,11 @@ static void __unload_reps_sf_vport(struct mlx5_eswitch *esw, u8 rep_type) static void __unload_reps_all_vport(struct mlx5_eswitch *esw, u8 rep_type) { struct mlx5_eswitch_rep *rep; - int i; + unsigned long i; __unload_reps_sf_vport(esw, rep_type); - mlx5_esw_for_each_vf_rep_reverse(esw, i, rep, esw->esw_funcs.num_vfs) + mlx5_esw_for_each_vf_rep(esw, i, rep) __esw_offloads_unload_rep(esw, rep, rep_type); if (mlx5_ecpf_vport_exists(esw->dev)) { @@ -2270,9 +2263,11 @@ int esw_offloads_load_rep(struct mlx5_eswitch *esw, u16 vport_num) if (esw->mode != MLX5_ESWITCH_OFFLOADS) return 0; - err = mlx5_esw_offloads_devlink_port_register(esw, vport_num); - if (err) - return err; + if (vport_num != MLX5_VPORT_UPLINK) { + err = mlx5_esw_offloads_devlink_port_register(esw, vport_num); + if (err) + return err; + } err = mlx5_esw_offloads_rep_load(esw, vport_num); if (err) @@ -2280,7 +2275,8 @@ int esw_offloads_load_rep(struct mlx5_eswitch *esw, u16 vport_num) return err; load_err: - mlx5_esw_offloads_devlink_port_unregister(esw, vport_num); + if (vport_num != MLX5_VPORT_UPLINK) + mlx5_esw_offloads_devlink_port_unregister(esw, vport_num); return err; } @@ -2290,7 +2286,9 @@ void esw_offloads_unload_rep(struct mlx5_eswitch *esw, u16 vport_num) return; mlx5_esw_offloads_rep_unload(esw, vport_num); - mlx5_esw_offloads_devlink_port_unregister(esw, vport_num); + + if (vport_num != MLX5_VPORT_UPLINK) + mlx5_esw_offloads_devlink_port_unregister(esw, vport_num); } #define ESW_OFFLOADS_DEVCOM_PAIR (0) @@ -2299,13 +2297,8 @@ void esw_offloads_unload_rep(struct mlx5_eswitch *esw, u16 vport_num) static int mlx5_esw_offloads_pair(struct mlx5_eswitch *esw, struct mlx5_eswitch *peer_esw) { - int err; - - err = esw_add_fdb_peer_miss_rules(esw, peer_esw->dev); - if (err) - return err; - return 0; + return esw_add_fdb_peer_miss_rules(esw, peer_esw->dev); } static void mlx5_esw_offloads_unpair(struct mlx5_eswitch *esw) @@ -2430,8 +2423,7 @@ static void esw_offloads_devcom_cleanup(struct mlx5_eswitch *esw) mlx5_devcom_unregister_component(devcom, MLX5_DEVCOM_ESW_OFFLOADS); } -static bool -esw_check_vport_match_metadata_supported(const struct mlx5_eswitch *esw) +bool mlx5_esw_vport_match_metadata_supported(const struct mlx5_eswitch *esw) { if (!MLX5_CAP_ESW(esw->dev, esw_uplink_ingress_acl)) return false; @@ -2500,25 +2492,25 @@ static void esw_offloads_vport_metadata_cleanup(struct mlx5_eswitch *esw, static void esw_offloads_metadata_uninit(struct mlx5_eswitch *esw) { struct mlx5_vport *vport; - int i; + unsigned long i; if (!mlx5_eswitch_vport_match_metadata_enabled(esw)) return; - mlx5_esw_for_all_vports_reverse(esw, i, vport) + mlx5_esw_for_each_vport(esw, i, vport) esw_offloads_vport_metadata_cleanup(esw, vport); } static int esw_offloads_metadata_init(struct mlx5_eswitch *esw) { struct mlx5_vport *vport; + unsigned long i; int err; - int i; if (!mlx5_eswitch_vport_match_metadata_enabled(esw)) return 0; - mlx5_esw_for_all_vports(esw, i, vport) { + mlx5_esw_for_each_vport(esw, i, vport) { err = esw_offloads_vport_metadata_setup(esw, vport); if (err) goto metadata_err; @@ -2531,6 +2523,28 @@ metadata_err: return err; } +int mlx5_esw_offloads_vport_metadata_set(struct mlx5_eswitch *esw, bool enable) +{ + int err = 0; + + down_write(&esw->mode_lock); + if (esw->mode != MLX5_ESWITCH_NONE) { + err = -EBUSY; + goto done; + } + if (!mlx5_esw_vport_match_metadata_supported(esw)) { + err = -EOPNOTSUPP; + goto done; + } + if (enable) + esw->flags |= MLX5_ESWITCH_VPORT_MATCH_METADATA; + else + esw->flags &= ~MLX5_ESWITCH_VPORT_MATCH_METADATA; +done: + up_write(&esw->mode_lock); + return err; +} + int esw_vport_create_offloads_acl_tables(struct mlx5_eswitch *esw, struct mlx5_vport *vport) @@ -2565,6 +2579,9 @@ static int esw_create_uplink_offloads_acl_tables(struct mlx5_eswitch *esw) struct mlx5_vport *vport; vport = mlx5_eswitch_get_vport(esw, MLX5_VPORT_UPLINK); + if (IS_ERR(vport)) + return PTR_ERR(vport); + return esw_vport_create_offloads_acl_tables(esw, vport); } @@ -2573,6 +2590,9 @@ static void esw_destroy_uplink_offloads_acl_tables(struct mlx5_eswitch *esw) struct mlx5_vport *vport; vport = mlx5_eswitch_get_vport(esw, MLX5_VPORT_UPLINK); + if (IS_ERR(vport)) + return; + esw_vport_destroy_offloads_acl_tables(esw, vport); } @@ -2584,6 +2604,7 @@ static int esw_offloads_steering_init(struct mlx5_eswitch *esw) memset(&esw->fdb_table.offloads, 0, sizeof(struct offloads_fdb)); mutex_init(&esw->fdb_table.offloads.vports.lock); hash_init(esw->fdb_table.offloads.vports.table); + atomic64_set(&esw->user_count, 0); indir = mlx5_esw_indir_table_init(); if (IS_ERR(indir)) { @@ -2726,10 +2747,25 @@ static int mlx5_esw_host_number_init(struct mlx5_eswitch *esw) return 0; } +bool mlx5_esw_offloads_controller_valid(const struct mlx5_eswitch *esw, u32 controller) +{ + /* Local controller is always valid */ + if (controller == 0) + return true; + + if (!mlx5_core_is_ecpf_esw_manager(esw->dev)) + return false; + + /* External host number starts with zero in device */ + return (controller == esw->offloads.host_number + 1); +} + int esw_offloads_enable(struct mlx5_eswitch *esw) { + struct mapping_ctx *reg_c0_obj_pool; struct mlx5_vport *vport; - int err, i; + unsigned long i; + int err; if (MLX5_CAP_ESW_FLOWTABLE_FDB(esw->dev, reformat) && MLX5_CAP_ESW_FLOWTABLE_FDB(esw->dev, decap)) @@ -2744,9 +2780,6 @@ int esw_offloads_enable(struct mlx5_eswitch *esw) if (err) goto err_metadata; - if (esw_check_vport_match_metadata_supported(esw)) - esw->flags |= MLX5_ESWITCH_VPORT_MATCH_METADATA; - err = esw_offloads_metadata_init(esw); if (err) goto err_metadata; @@ -2755,6 +2788,15 @@ int esw_offloads_enable(struct mlx5_eswitch *esw) if (err) goto err_vport_metadata; + reg_c0_obj_pool = mapping_create(sizeof(struct mlx5_mapped_obj), + ESW_REG_C0_USER_DATA_METADATA_MASK, + true); + if (IS_ERR(reg_c0_obj_pool)) { + err = PTR_ERR(reg_c0_obj_pool); + goto err_pool; + } + esw->offloads.reg_c0_obj_pool = reg_c0_obj_pool; + err = esw_offloads_steering_init(esw); if (err) goto err_steering_init; @@ -2781,11 +2823,12 @@ err_vports: err_uplink: esw_offloads_steering_cleanup(esw); err_steering_init: + mapping_destroy(reg_c0_obj_pool); +err_pool: esw_set_passing_vport_metadata(esw, false); err_vport_metadata: esw_offloads_metadata_uninit(esw); err_metadata: - esw->flags &= ~MLX5_ESWITCH_VPORT_MATCH_METADATA; mlx5_rdma_disable_roce(esw->dev); mutex_destroy(&esw->offloads.termtbl_mutex); return err; @@ -2819,8 +2862,8 @@ void esw_offloads_disable(struct mlx5_eswitch *esw) esw_offloads_unload_rep(esw, MLX5_VPORT_UPLINK); esw_set_passing_vport_metadata(esw, false); esw_offloads_steering_cleanup(esw); + mapping_destroy(esw->offloads.reg_c0_obj_pool); esw_offloads_metadata_uninit(esw); - esw->flags &= ~MLX5_ESWITCH_VPORT_MATCH_METADATA; mlx5_rdma_disable_roce(esw->dev); mutex_destroy(&esw->offloads.termtbl_mutex); esw->offloads.encap = DEVLINK_ESWITCH_ENCAP_MODE_NONE; @@ -2925,8 +2968,14 @@ int mlx5_devlink_eswitch_mode_set(struct devlink *devlink, u16 mode, if (esw_mode_from_devlink(mode, &mlx5_mode)) return -EINVAL; - mutex_lock(&esw->mode_lock); - cur_mlx5_mode = esw->mode; + err = mlx5_esw_try_lock(esw); + if (err < 0) { + NL_SET_ERR_MSG_MOD(extack, "Can't change mode, E-Switch is busy"); + return err; + } + cur_mlx5_mode = err; + err = 0; + if (cur_mlx5_mode == mlx5_mode) goto unlock; @@ -2938,7 +2987,7 @@ int mlx5_devlink_eswitch_mode_set(struct devlink *devlink, u16 mode, err = -EINVAL; unlock: - mutex_unlock(&esw->mode_lock); + mlx5_esw_unlock(esw); return err; } @@ -2951,14 +3000,45 @@ int mlx5_devlink_eswitch_mode_get(struct devlink *devlink, u16 *mode) if (IS_ERR(esw)) return PTR_ERR(esw); - mutex_lock(&esw->mode_lock); + down_write(&esw->mode_lock); err = eswitch_devlink_esw_mode_check(esw); if (err) goto unlock; err = esw_mode_to_devlink(esw->mode, mode); unlock: - mutex_unlock(&esw->mode_lock); + up_write(&esw->mode_lock); + return err; +} + +static int mlx5_esw_vports_inline_set(struct mlx5_eswitch *esw, u8 mlx5_mode, + struct netlink_ext_ack *extack) +{ + struct mlx5_core_dev *dev = esw->dev; + struct mlx5_vport *vport; + u16 err_vport_num = 0; + unsigned long i; + int err = 0; + + mlx5_esw_for_each_host_func_vport(esw, i, vport, esw->esw_funcs.num_vfs) { + err = mlx5_modify_nic_vport_min_inline(dev, vport->vport, mlx5_mode); + if (err) { + err_vport_num = vport->vport; + NL_SET_ERR_MSG_MOD(extack, + "Failed to set min inline on vport"); + goto revert_inline_mode; + } + } + return 0; + +revert_inline_mode: + mlx5_esw_for_each_host_func_vport(esw, i, vport, esw->esw_funcs.num_vfs) { + if (vport->vport == err_vport_num) + break; + mlx5_modify_nic_vport_min_inline(dev, + vport->vport, + esw->offloads.inline_mode); + } return err; } @@ -2966,15 +3046,15 @@ int mlx5_devlink_eswitch_inline_mode_set(struct devlink *devlink, u8 mode, struct netlink_ext_ack *extack) { struct mlx5_core_dev *dev = devlink_priv(devlink); - int err, vport, num_vport; struct mlx5_eswitch *esw; u8 mlx5_mode; + int err; esw = mlx5_devlink_eswitch_get(devlink); if (IS_ERR(esw)) return PTR_ERR(esw); - mutex_lock(&esw->mode_lock); + down_write(&esw->mode_lock); err = eswitch_devlink_esw_mode_check(esw); if (err) goto out; @@ -3003,27 +3083,16 @@ int mlx5_devlink_eswitch_inline_mode_set(struct devlink *devlink, u8 mode, if (err) goto out; - mlx5_esw_for_each_host_func_vport(esw, vport, esw->esw_funcs.num_vfs) { - err = mlx5_modify_nic_vport_min_inline(dev, vport, mlx5_mode); - if (err) { - NL_SET_ERR_MSG_MOD(extack, - "Failed to set min inline on vport"); - goto revert_inline_mode; - } - } + err = mlx5_esw_vports_inline_set(esw, mlx5_mode, extack); + if (err) + goto out; esw->offloads.inline_mode = mlx5_mode; - mutex_unlock(&esw->mode_lock); + up_write(&esw->mode_lock); return 0; -revert_inline_mode: - num_vport = --vport; - mlx5_esw_for_each_host_func_vport_reverse(esw, vport, num_vport) - mlx5_modify_nic_vport_min_inline(dev, - vport, - esw->offloads.inline_mode); out: - mutex_unlock(&esw->mode_lock); + up_write(&esw->mode_lock); return err; } @@ -3036,14 +3105,14 @@ int mlx5_devlink_eswitch_inline_mode_get(struct devlink *devlink, u8 *mode) if (IS_ERR(esw)) return PTR_ERR(esw); - mutex_lock(&esw->mode_lock); + down_write(&esw->mode_lock); err = eswitch_devlink_esw_mode_check(esw); if (err) goto unlock; err = esw_inline_mode_to_devlink(esw->offloads.inline_mode, mode); unlock: - mutex_unlock(&esw->mode_lock); + up_write(&esw->mode_lock); return err; } @@ -3059,7 +3128,7 @@ int mlx5_devlink_eswitch_encap_mode_set(struct devlink *devlink, if (IS_ERR(esw)) return PTR_ERR(esw); - mutex_lock(&esw->mode_lock); + down_write(&esw->mode_lock); err = eswitch_devlink_esw_mode_check(esw); if (err) goto unlock; @@ -3105,7 +3174,7 @@ int mlx5_devlink_eswitch_encap_mode_set(struct devlink *devlink, } unlock: - mutex_unlock(&esw->mode_lock); + up_write(&esw->mode_lock); return err; } @@ -3120,14 +3189,14 @@ int mlx5_devlink_eswitch_encap_mode_get(struct devlink *devlink, return PTR_ERR(esw); - mutex_lock(&esw->mode_lock); + down_write(&esw->mode_lock); err = eswitch_devlink_esw_mode_check(esw); if (err) goto unlock; *encap = esw->offloads.encap; unlock: - mutex_unlock(&esw->mode_lock); + up_write(&esw->mode_lock); return 0; } @@ -3152,11 +3221,12 @@ void mlx5_eswitch_register_vport_reps(struct mlx5_eswitch *esw, { struct mlx5_eswitch_rep_data *rep_data; struct mlx5_eswitch_rep *rep; - int i; + unsigned long i; esw->offloads.rep_ops[rep_type] = ops; - mlx5_esw_for_all_reps(esw, i, rep) { - if (likely(mlx5_eswitch_vport_has_rep(esw, i))) { + mlx5_esw_for_each_rep(esw, i, rep) { + if (likely(mlx5_eswitch_vport_has_rep(esw, rep->vport))) { + rep->esw = esw; rep_data = &rep->rep_data[rep_type]; atomic_set(&rep_data->state, REP_REGISTERED); } @@ -3167,12 +3237,12 @@ EXPORT_SYMBOL(mlx5_eswitch_register_vport_reps); void mlx5_eswitch_unregister_vport_reps(struct mlx5_eswitch *esw, u8 rep_type) { struct mlx5_eswitch_rep *rep; - int i; + unsigned long i; if (esw->mode == MLX5_ESWITCH_OFFLOADS) __unload_reps_all_vport(esw, rep_type); - mlx5_esw_for_all_reps(esw, i, rep) + mlx5_esw_for_each_rep(esw, i, rep) atomic_set(&rep->rep_data[rep_type].state, REP_UNREGISTERED); } EXPORT_SYMBOL(mlx5_eswitch_unregister_vport_reps); @@ -3213,12 +3283,6 @@ struct mlx5_eswitch_rep *mlx5_eswitch_vport_rep(struct mlx5_eswitch *esw, } EXPORT_SYMBOL(mlx5_eswitch_vport_rep); -bool mlx5_eswitch_is_vf_vport(const struct mlx5_eswitch *esw, u16 vport_num) -{ - return vport_num >= MLX5_VPORT_FIRST_VF && - vport_num <= esw->dev->priv.sriov.max_vfs; -} - bool mlx5_eswitch_reg_c1_loopback_enabled(const struct mlx5_eswitch *esw) { return !!(esw->flags & MLX5_ESWITCH_REG_C1_LOOPBACK_ENABLED); @@ -3244,7 +3308,7 @@ u32 mlx5_eswitch_get_vport_metadata_for_match(struct mlx5_eswitch *esw, EXPORT_SYMBOL(mlx5_eswitch_get_vport_metadata_for_match); int mlx5_esw_offloads_sf_vport_enable(struct mlx5_eswitch *esw, struct devlink_port *dl_port, - u16 vport_num, u32 sfnum) + u16 vport_num, u32 controller, u32 sfnum) { int err; @@ -3252,7 +3316,7 @@ int mlx5_esw_offloads_sf_vport_enable(struct mlx5_eswitch *esw, struct devlink_p if (err) return err; - err = mlx5_esw_devlink_sf_port_register(esw, dl_port, vport_num, sfnum); + err = mlx5_esw_devlink_sf_port_register(esw, dl_port, vport_num, controller, sfnum); if (err) goto devlink_err; |