author | Linus Torvalds <torvalds@linux-foundation.org> | 2021-04-29 11:57:23 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2021-04-29 11:57:23 -0700 |
commit | 9d31d2338950293ec19d9b095fbaa9030899dcb4 (patch) | |
tree | e688040d0557c24a2eeb9f6c9c223d949f6f7ef9 /drivers/net/netdevsim/fib.c | |
parent | 635de956a7f5a6ffcb04f29d70630c64c717b56b (diff) | |
parent | 4a52dd8fefb45626dace70a63c0738dbd83b7edb (diff) | |
Merge tag 'net-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- bpf:
- allow bpf programs calling kernel functions (initially to
reuse TCP congestion control implementations)
- enable task local storage for tracing programs - remove the
need to store per-task state in hash maps, and allow tracing
programs access to task local storage previously added for
BPF_LSM
- add bpf_for_each_map_elem() helper, allowing programs to walk
all map elements in a more robust and easier to verify fashion
- sockmap: support UDP and cross-protocol BPF_SK_SKB_VERDICT
redirection
- lpm: add support for batched ops in LPM trie
- add BTF_KIND_FLOAT support - mostly to allow use of BTF on
s390 which has floats in its header files
- improve BPF syscall documentation and extend the use of kdoc
parsing scripts we already employ for bpf-helpers
- libbpf, bpftool: support static linking of BPF ELF files
- improve support for encapsulation of L2 packets
- xdp: restructure redirect actions to avoid a runtime lookup,
improving performance by 4-8% in microbenchmarks
- xsk: build skb by page (aka generic zerocopy xmit) - improve
performance of software AF_XDP path by 33% for devices which don't
need headers in the linear skb part (e.g. virtio)
- nexthop: resilient next-hop groups - improve path stability on
next-hop group changes (incl. offload for mlxsw)
- ipv6: segment routing: add support for IPv4 decapsulation
- icmp: add support for RFC 8335 extended PROBE messages
- inet: use bigger hash table for IP ID generation
- tcp: deal better with delayed TX completions - make sure we don't
give up on fast TCP retransmissions only because driver is slow in
reporting that it completed transmitting the original
- tcp: reorder tcp_congestion_ops for better cache locality
- mptcp:
- add sockopt support for common TCP options
- add support for common TCP msg flags
- include multiple address ids in RM_ADDR
- add reset option support for resetting one subflow
- udp: GRO L4 improvements - improve 'forward' / 'frag_list'
co-existence with UDP tunnel GRO, allowing the first to take place
correctly even for encapsulated UDP traffic
- micro-optimize dev_gro_receive() and flow dissection, avoid
retpoline overhead on VLAN and TEB GRO
- use less memory for sysctls, add a new sysctl type, to allow using
u8 instead of "int" and "long" and shrink networking sysctls
- veth: allow GRO without XDP - this allows aggregating UDP packets
before handing them off to routing, bridge, OvS, etc.
- allow specifying ifindex when device is moved to another namespace
- netfilter:
- nft_socket: add support for cgroupsv2
- nftables: add catch-all set element - special element used to
define a default action in case normal lookup missed
- use net_generic infra in many modules to avoid allocating
per-ns memory unnecessarily
- xps: improve the xps handling to avoid potential out-of-bound
accesses and use-after-free when XPS changes race with other
re-configuration under traffic
- add a config knob to turn off per-cpu netdev refcnt to catch
underflows in testing
Device APIs:
- add WWAN subsystem to organize the WWAN interfaces better and
hopefully start driving towards more unified and vendor-
independent APIs
- ethtool:
- add interface for reading IEEE MIB stats (incl. mlx5 and bnxt
support)
- allow network drivers to dump arbitrary SFP EEPROM data,
the current offset+length API was a poor fit for modern SFPs, which
define EEPROM in terms of pages (incl. mlx5 support)
- act_police, flow_offload: add support for packet-per-second
policing (incl. offload for nfp)
- psample: add additional metadata attributes like transit delay for
packets sampled from switch HW (and corresponding egress and
policy-based sampling in the mlxsw driver)
- dsa: improve support for sandwiched LAGs with bridge and DSA
- netfilter:
- flowtable: use direct xmit in topologies with IP forwarding,
bridging, vlans etc.
- nftables: counter hardware offload support
- Bluetooth:
- improvements for firmware download w/ Intel devices
- add support for reading AOSP vendor capabilities
- add support for virtio transport driver
- mac80211:
- allow concurrent monitor iface and ethernet rx decap
- set priority and queue mapping for injected frames
- phy: add support for Clause-45 PHY Loopback
- pci/iov: add sysfs MSI-X vector assignment interface to distribute
MSI-X resources to VFs (incl. mlx5 support)
New hardware/drivers:
- dsa: mv88e6xxx: add support for Marvell mv88e6393x - 11-port
Ethernet switch with 8x 1-Gigabit Ethernet and 3x 10-Gigabit
interfaces.
- dsa: support for legacy Broadcom tags used on BCM5325, BCM5365 and
BCM63xx switches
- Microchip KSZ8863 and KSZ8873; 3x 10/100Mbps Ethernet switches
- ath11k: support for QCN9074, an 802.11ax device
- Bluetooth: Broadcom BCM4330 and BCM4334
- phy: Marvell 88X2222 transceiver support
- mdio: add BCM6368 MDIO mux bus controller
- r8152: support RTL8153 and RTL8156 (USB Ethernet) chips
- mana: driver for Microsoft Azure Network Adapter (MANA)
- Actions Semi Owl Ethernet MAC
- can: driver for ETAS ES58X CAN/USB interfaces
Pure driver changes:
- add XDP support to: enetc, igc, stmmac
- add AF_XDP support to: stmmac
- virtio:
- page_to_skb() use build_skb when there's sufficient tailroom
(21% improvement for 1000B UDP frames)
- support XDP even without dedicated Tx queues - share the Tx
queues with the stack when necessary
- mlx5:
- flow rules: add support for mirroring with conntrack, matching
on ICMP, GTP, flex filters and more
- support packet sampling with flow offloads
- persist uplink representor netdev across eswitch mode changes
- allow coexistence of CQE compression and HW time-stamping
- add ethtool extended link error state reporting
- ice, iavf: support flow filters, UDP Segmentation Offload
- dpaa2-switch:
- move the driver out of staging
- add spanning tree (STP) support
- add rx copybreak support
- add tc flower hardware offload on ingress traffic
- ionic:
- implement Rx page reuse
- support HW PTP time-stamping
- octeon: support TC hardware offloads - flower matching on ingress
and egress rate limiting.
- stmmac:
- add RX frame steering based on VLAN priority in tc flower
- support frame preemption (FPE)
- intel: add cross time-stamping freq difference adjustment
- ocelot:
- support forwarding of MRP frames in HW
- support multiple bridges
- support PTP Sync one-step timestamping
- dsa: mv88e6xxx, dpaa2-switch: offload bridge port flags like
learning, flooding etc.
- ipa: add IPA v4.5, v4.9 and v4.11 support (Qualcomm SDX55, SM8350,
SC7280 SoCs)
- mt7601u: enable TDLS support
- mt76:
- add support for 802.3 rx frames (mt7915/mt7615)
- mt7915 flash pre-calibration support
- mt7921/mt7663 runtime power management fixes"
* tag 'net-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2451 commits)
net: selftest: fix build issue if INET is disabled
net: netrom: nr_in: Remove redundant assignment to ns
net: tun: Remove redundant assignment to ret
net: phy: marvell: add downshift support for M88E1240
net: dsa: ksz: Make reg_mib_cnt a u8 as it never exceeds 255
net/sched: act_ct: Remove redundant ct get and check
icmp: standardize naming of RFC 8335 PROBE constants
bpf, selftests: Update array map tests for per-cpu batched ops
bpf: Add batched ops support for percpu array
bpf: Implement formatted output helpers with bstr_printf
seq_file: Add a seq_bprintf function
sfc: adjust efx->xdp_tx_queue_count with the real number of initialized queues
net:nfc:digital: Fix a double free in digital_tg_recv_dep_req
net: fix a concurrency bug in l2tp_tunnel_register()
net/smc: Remove redundant assignment to rc
mpls: Remove redundant assignment to err
llc2: Remove redundant assignment to rc
net/tls: Remove redundant initialization of record
rds: Remove redundant assignment to nr_sig
dt-bindings: net: mdio-gpio: add compatible for microchip,mdio-smi0
...
Diffstat (limited to 'drivers/net/netdevsim/fib.c')
-rw-r--r-- | drivers/net/netdevsim/fib.c | 147 |
1 files changed, 135 insertions, 12 deletions
diff --git a/drivers/net/netdevsim/fib.c b/drivers/net/netdevsim/fib.c
index 46fb414f7ca6..213d3e5056c8 100644
--- a/drivers/net/netdevsim/fib.c
+++ b/drivers/net/netdevsim/fib.c
@@ -14,6 +14,7 @@
  * THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
  */
 
+#include <linux/bitmap.h>
 #include <linux/in6.h>
 #include <linux/kernel.h>
 #include <linux/list.h>
@@ -47,15 +48,18 @@ struct nsim_fib_data {
 	struct nsim_fib_entry nexthops;
 	struct rhashtable fib_rt_ht;
 	struct list_head fib_rt_list;
-	struct mutex fib_lock; /* Protects hashtable and list */
+	struct mutex fib_lock; /* Protects FIB HT and list */
 	struct notifier_block nexthop_nb;
 	struct rhashtable nexthop_ht;
 	struct devlink *devlink;
 	struct work_struct fib_event_work;
 	struct list_head fib_event_queue;
 	spinlock_t fib_event_queue_lock; /* Protects fib event queue list */
+	struct mutex nh_lock; /* Protects NH HT */
 	struct dentry *ddir;
 	bool fail_route_offload;
+	bool fail_res_nexthop_group_replace;
+	bool fail_nexthop_bucket_replace;
 };
 
 struct nsim_fib_rt_key {
@@ -116,6 +120,7 @@ struct nsim_nexthop {
 	struct rhash_head ht_node;
 	u64 occ;
 	u32 id;
+	bool is_resilient;
 };
 
 static const struct rhashtable_params nsim_nexthop_ht_params = {
@@ -561,7 +566,7 @@ nsim_fib6_rt_create(struct nsim_fib_data *data,
 err_fib6_rt_nh_del:
 	for (i--; i >= 0; i--) {
 		nsim_fib6_rt_nh_del(fib6_rt, rt_arr[i]);
-	};
+	}
 	nsim_fib_rt_fini(&fib6_rt->common);
 	kfree(fib6_rt);
 	return ERR_PTR(err);
@@ -869,10 +874,8 @@ err_rt_offload_failed_flag_set:
 	return err;
 }
 
-static int nsim_fib_event(struct nsim_fib_event *fib_event)
+static void nsim_fib_event(struct nsim_fib_event *fib_event)
 {
-	int err = 0;
-
 	switch (fib_event->family) {
 	case AF_INET:
 		nsim_fib4_event(fib_event->data, &fib_event->fen_info,
@@ -885,8 +888,6 @@ static int nsim_fib_event(struct nsim_fib_event *fib_event)
 		nsim_fib6_event_fini(&fib_event->fib6_event);
 		break;
 	}
-
-	return err;
 }
 
 static int nsim_fib4_prepare_event(struct fib_notifier_info *info,
@@ -1118,6 +1119,10 @@ static struct nsim_nexthop *nsim_nexthop_create(struct nsim_fib_data *data,
 		for (i = 0; i < info->nh_grp->num_nh; i++)
 			occ += info->nh_grp->nh_entries[i].weight;
 		break;
+	case NH_NOTIFIER_INFO_TYPE_RES_TABLE:
+		occ = info->nh_res_table->num_nh_buckets;
+		nexthop->is_resilient = true;
+		break;
 	default:
 		NL_SET_ERR_MSG_MOD(info->extack, "Unsupported nexthop type");
 		kfree(nexthop);
@@ -1160,6 +1165,21 @@ err_num_decrease:
 
 }
 
+static void nsim_nexthop_hw_flags_set(struct net *net,
+				      const struct nsim_nexthop *nexthop,
+				      bool trap)
+{
+	int i;
+
+	nexthop_set_hw_flags(net, nexthop->id, false, trap);
+
+	if (!nexthop->is_resilient)
+		return;
+
+	for (i = 0; i < nexthop->occ; i++)
+		nexthop_bucket_set_hw_flags(net, nexthop->id, i, false, trap);
+}
+
 static int nsim_nexthop_add(struct nsim_fib_data *data,
 			    struct nsim_nexthop *nexthop,
 			    struct netlink_ext_ack *extack)
@@ -1178,7 +1198,7 @@ static int nsim_nexthop_add(struct nsim_fib_data *data,
 		goto err_nexthop_dismiss;
 	}
 
-	nexthop_set_hw_flags(net, nexthop->id, false, true);
+	nsim_nexthop_hw_flags_set(net, nexthop, true);
 
 	return 0;
 
@@ -1207,7 +1227,7 @@ static int nsim_nexthop_replace(struct nsim_fib_data *data,
 		goto err_nexthop_dismiss;
 	}
 
-	nexthop_set_hw_flags(net, nexthop->id, false, true);
+	nsim_nexthop_hw_flags_set(net, nexthop, true);
 	nsim_nexthop_account(data, nexthop_old->occ, false, extack);
 	nsim_nexthop_destroy(nexthop_old);
 
@@ -1258,6 +1278,32 @@ static void nsim_nexthop_remove(struct nsim_fib_data *data,
 	nsim_nexthop_destroy(nexthop);
 }
 
+static int nsim_nexthop_res_table_pre_replace(struct nsim_fib_data *data,
+					      struct nh_notifier_info *info)
+{
+	if (data->fail_res_nexthop_group_replace) {
+		NL_SET_ERR_MSG_MOD(info->extack, "Failed to replace a resilient nexthop group");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int nsim_nexthop_bucket_replace(struct nsim_fib_data *data,
+				       struct nh_notifier_info *info)
+{
+	if (data->fail_nexthop_bucket_replace) {
+		NL_SET_ERR_MSG_MOD(info->extack, "Failed to replace nexthop bucket");
+		return -EINVAL;
+	}
+
+	nexthop_bucket_set_hw_flags(info->net, info->id,
+				    info->nh_res_bucket->bucket_index,
+				    false, true);
+
+	return 0;
+}
+
 static int nsim_nexthop_event_nb(struct notifier_block *nb, unsigned long event,
 				 void *ptr)
 {
@@ -1266,8 +1312,7 @@ static int nsim_nexthop_event_nb(struct notifier_block *nb, unsigned long event,
 	struct nh_notifier_info *info = ptr;
 	int err = 0;
 
-	ASSERT_RTNL();
-
+	mutex_lock(&data->nh_lock);
 	switch (event) {
 	case NEXTHOP_EVENT_REPLACE:
 		err = nsim_nexthop_insert(data, info);
@@ -1275,10 +1320,17 @@ static int nsim_nexthop_event_nb(struct notifier_block *nb, unsigned long event,
 	case NEXTHOP_EVENT_DEL:
 		nsim_nexthop_remove(data, info);
 		break;
+	case NEXTHOP_EVENT_RES_TABLE_PRE_REPLACE:
+		err = nsim_nexthop_res_table_pre_replace(data, info);
+		break;
+	case NEXTHOP_EVENT_BUCKET_REPLACE:
+		err = nsim_nexthop_bucket_replace(data, info);
+		break;
 	default:
 		break;
 	}
 
+	mutex_unlock(&data->nh_lock);
 	return notifier_from_errno(err);
 }
 
@@ -1289,11 +1341,68 @@ static void nsim_nexthop_free(void *ptr, void *arg)
 	struct net *net;
 
 	net = devlink_net(data->devlink);
-	nexthop_set_hw_flags(net, nexthop->id, false, false);
+	nsim_nexthop_hw_flags_set(net, nexthop, false);
 	nsim_nexthop_account(data, nexthop->occ, false, NULL);
 	nsim_nexthop_destroy(nexthop);
 }
 
+static ssize_t nsim_nexthop_bucket_activity_write(struct file *file,
+						  const char __user *user_buf,
+						  size_t size, loff_t *ppos)
+{
+	struct nsim_fib_data *data = file->private_data;
+	struct net *net = devlink_net(data->devlink);
+	struct nsim_nexthop *nexthop;
+	unsigned long *activity;
+	loff_t pos = *ppos;
+	u16 bucket_index;
+	char buf[128];
+	int err = 0;
+	u32 nhid;
+
+	if (pos != 0)
+		return -EINVAL;
+	if (size > sizeof(buf))
+		return -EINVAL;
+	if (copy_from_user(buf, user_buf, size))
+		return -EFAULT;
+	if (sscanf(buf, "%u %hu", &nhid, &bucket_index) != 2)
+		return -EINVAL;
+
+	rtnl_lock();
+
+	nexthop = rhashtable_lookup_fast(&data->nexthop_ht, &nhid,
+					 nsim_nexthop_ht_params);
+	if (!nexthop || !nexthop->is_resilient ||
+	    bucket_index >= nexthop->occ) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	activity = bitmap_zalloc(nexthop->occ, GFP_KERNEL);
+	if (!activity) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	bitmap_set(activity, bucket_index, 1);
+	nexthop_res_grp_activity_update(net, nhid, nexthop->occ, activity);
+	bitmap_free(activity);
+
+out:
+	rtnl_unlock();
+
+	*ppos = size;
+	return err ?: size;
+}
+
+static const struct file_operations nsim_nexthop_bucket_activity_fops = {
+	.open = simple_open,
+	.write = nsim_nexthop_bucket_activity_write,
+	.llseek = no_llseek,
+	.owner = THIS_MODULE,
+};
+
 static u64 nsim_fib_ipv4_resource_occ_get(void *priv)
 {
 	struct nsim_fib_data *data = priv;
@@ -1383,6 +1492,17 @@ nsim_fib_debugfs_init(struct nsim_fib_data *data, struct nsim_dev *nsim_dev)
 	data->fail_route_offload = false;
 	debugfs_create_bool("fail_route_offload", 0600, data->ddir,
 			    &data->fail_route_offload);
+
+	data->fail_res_nexthop_group_replace = false;
+	debugfs_create_bool("fail_res_nexthop_group_replace", 0600, data->ddir,
+			    &data->fail_res_nexthop_group_replace);
+
+	data->fail_nexthop_bucket_replace = false;
+	debugfs_create_bool("fail_nexthop_bucket_replace", 0600, data->ddir,
+			    &data->fail_nexthop_bucket_replace);
+
+	debugfs_create_file("nexthop_bucket_activity", 0200, data->ddir,
+			    data, &nsim_nexthop_bucket_activity_fops);
 	return 0;
 }
 
@@ -1408,6 +1528,7 @@ struct nsim_fib_data *nsim_fib_create(struct devlink *devlink,
 	if (err)
 		goto err_data_free;
 
+	mutex_init(&data->nh_lock);
 	err = rhashtable_init(&data->nexthop_ht, &nsim_nexthop_ht_params);
 	if (err)
 		goto err_debugfs_exit;
@@ -1473,6 +1594,7 @@ err_rhashtable_nexthop_destroy:
 				    data);
 	mutex_destroy(&data->fib_lock);
 err_debugfs_exit:
+	mutex_destroy(&data->nh_lock);
 	nsim_fib_debugfs_exit(data);
 err_data_free:
 	kfree(data);
@@ -1501,6 +1623,7 @@ void nsim_fib_destroy(struct devlink *devlink, struct nsim_fib_data *data)
 	WARN_ON_ONCE(!list_empty(&data->fib_event_queue));
 	WARN_ON_ONCE(!list_empty(&data->fib_rt_list));
 	mutex_destroy(&data->fib_lock);
+	mutex_destroy(&data->nh_lock);
 	nsim_fib_debugfs_exit(data);
 	kfree(data);
 }
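
The `nexthop_bucket_activity` write handler added in this diff parses a "<nhid> <bucket_index>" pair, checks that the nexthop is resilient and that the index is within `occ`, and then marks exactly one bucket as active in a freshly zeroed bitmap. The following is a minimal userspace sketch of that parse, validate, and set-bit sequence, not the driver code itself. `struct demo_nexthop`, `demo_parse_activity()` and `demo_test_bit()` are hypothetical names introduced only for illustration; the sketch assumes a NUL-terminated input string and replaces the rhashtable lookup with a comparison against a single nexthop.

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

#define DEMO_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical stand-in for the per-nexthop state kept by the driver. */
struct demo_nexthop {
	unsigned int id;
	unsigned int num_buckets;	/* plays the role of nexthop->occ */
	int is_resilient;
};

/*
 * Parse "<nhid> <bucket_index>" from a NUL-terminated string, validate it
 * against the given nexthop, and return a newly allocated bitmap in which
 * only the requested bucket's bit is set. Returns NULL on a parse error,
 * a validation failure, or allocation failure. The caller frees the bitmap.
 */
static unsigned long *demo_parse_activity(const char *buf,
					  const struct demo_nexthop *nh)
{
	unsigned int nhid;
	unsigned short bucket_index;
	unsigned long *bitmap;
	size_t words;

	if (sscanf(buf, "%u %hu", &nhid, &bucket_index) != 2)
		return NULL;

	/* Same three conditions the handler rejects with -EINVAL. */
	if (nhid != nh->id || !nh->is_resilient ||
	    bucket_index >= nh->num_buckets)
		return NULL;

	/* One bit per bucket, rounded up to whole words, all zero. */
	words = (nh->num_buckets + DEMO_BITS_PER_WORD - 1) / DEMO_BITS_PER_WORD;
	bitmap = calloc(words, sizeof(*bitmap));
	if (!bitmap)
		return NULL;

	bitmap[bucket_index / DEMO_BITS_PER_WORD] |=
		1UL << (bucket_index % DEMO_BITS_PER_WORD);
	return bitmap;
}

/* Return 1 if the given bit is set in the bitmap, 0 otherwise. */
static int demo_test_bit(const unsigned long *bitmap, unsigned int bit)
{
	return (int)((bitmap[bit / DEMO_BITS_PER_WORD] >>
		      (bit % DEMO_BITS_PER_WORD)) & 1UL);
}
```

Calling `demo_parse_activity("10 65", &nh)` for a resilient nexthop with id 10 and 70 buckets yields a bitmap in which only bit 65 is set, while an index of 70 or a non-resilient nexthop yields NULL. In the driver the equivalent bitmap is handed to `nexthop_res_grp_activity_update()` under `rtnl_lock()` and then freed.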