summaryrefslogtreecommitdiffstats
path: root/drivers/net/bonding
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2024-03-12 17:44:08 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2024-03-12 17:44:08 -0700
commit9187210eee7d87eea37b45ea93454a88681894a4 (patch)
tree31b4610e62cdd5e1dfb700014aa619e41145d7d3 /drivers/net/bonding
parent1f440397665f4241346e4cc6d93f8b73880815d1 (diff)
parented1f164038b50c5864aa85389f3ffd456f050cca (diff)
downloadlinux-9187210eee7d87eea37b45ea93454a88681894a4.tar.gz
linux-9187210eee7d87eea37b45ea93454a88681894a4.tar.bz2
linux-9187210eee7d87eea37b45ea93454a88681894a4.zip
Merge tag 'net-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski: "Core & protocols: - Large effort by Eric to lower rtnl_lock pressure and remove locks: - Make commonly used parts of rtnetlink (address, route dumps etc) lockless, protected by RCU instead of rtnl_lock. - Add a netns exit callback which already holds rtnl_lock, allowing netns exit to take rtnl_lock once in the core instead of once for each driver / callback. - Remove locks / serialization in the socket diag interface. - Remove 6 calls to synchronize_rcu() while holding rtnl_lock. - Remove the dev_base_lock, depend on RCU where necessary. - Support busy polling on a per-epoll context basis. Poll length and budget parameters can be set independently of system defaults. - Introduce struct net_hotdata, to make sure read-mostly global config variables fit in as few cache lines as possible. - Add optional per-nexthop statistics to ease monitoring / debug of ECMP imbalance problems. - Support TCP_NOTSENT_LOWAT in MPTCP. - Ensure that IPv6 temporary addresses' preferred lifetimes are long enough, compared to other configured lifetimes, and at least 2 sec. - Support forwarding of ICMP Error messages in IPSec, per RFC 4301. - Add support for the independent control state machine for bonding per IEEE 802.1AX-2008 5.4.15 in addition to the existing coupled control state machine. - Add "network ID" to MCTP socket APIs to support hosts with multiple disjoint MCTP networks. - Re-use the mono_delivery_time skbuff bit for packets which user space wants to be sent at a specified time. Maintain the timing information while traversing veth links, bridge etc. - Take advantage of MSG_SPLICE_PAGES for RxRPC DATA and ACK packets. - Simplify many places iterating over netdevs by using an xarray instead of a hash table walk (hash table remains in place, for use on fastpaths). - Speed up scanning for expired routes by keeping a dedicated list. - Speed up "generic" XDP by trying harder to avoid large allocations. - Support attaching arbitrary metadata to netconsole messages. Things we sprinkled into general kernel code: - Enforce VM_IOREMAP flag and range in ioremap_page_range and introduce VM_SPARSE kind and vm_area_[un]map_pages (used by bpf_arena). - Rework selftest harness to enable the use of the full range of ksft exit code (pass, fail, skip, xfail, xpass). Netfilter: - Allow userspace to define a table that is exclusively owned by a daemon (via netlink socket aliveness) without auto-removing this table when the userspace program exits. Such table gets marked as orphaned and a restarting management daemon can re-attach/regain ownership. - Speed up element insertions to nftables' concatenated-ranges set type. Compact a few related data structures. BPF: - Add BPF token support for delegating a subset of BPF subsystem functionality from privileged system-wide daemons such as systemd through special mount options for userns-bound BPF fs to a trusted & unprivileged application. - Introduce bpf_arena which is sparse shared memory region between BPF program and user space where structures inside the arena can have pointers to other areas of the arena, and pointers work seamlessly for both user-space programs and BPF programs. - Introduce may_goto instruction that is a contract between the verifier and the program. The verifier allows the program to loop assuming it's behaving well, but reserves the right to terminate it. - Extend the BPF verifier to enable static subprog calls in spin lock critical sections. - Support registration of struct_ops types from modules which helps projects like fuse-bpf that seeks to implement a new struct_ops type. - Add support for retrieval of cookies for perf/kprobe multi links. - Support arbitrary TCP SYN cookie generation / validation in the TC layer with BPF to allow creating SYN flood handling in BPF firewalls. - Add code generation to inline the bpf_kptr_xchg() helper which improves performance when stashing/popping the allocated BPF objects. Wireless: - Add SPP (signaling and payload protected) AMSDU support. - Support wider bandwidth OFDMA, as required for EHT operation. Driver API: - Major overhaul of the Energy Efficient Ethernet internals to support new link modes (2.5GE, 5GE), share more code between drivers (especially those using phylib), and encourage more uniform behavior. Convert and clean up drivers. - Define an API for querying per netdev queue statistics from drivers. - IPSec: account in global stats for fully offloaded sessions. - Create a concept of Ethernet PHY Packages at the Device Tree level, to allow parameterizing the existing PHY package code. - Enable Rx hashing (RSS) on GTP protocol fields. Misc: - Improvements and refactoring all over networking selftests. - Create uniform module aliases for TC classifiers, actions, and packet schedulers to simplify creating modprobe policies. - Address all missing MODULE_DESCRIPTION() warnings in networking. - Extend the Netlink descriptions in YAML to cover message encapsulation or "Netlink polymorphism", where interpretation of nested attributes depends on link type, classifier type or some other "class type". Drivers: - Ethernet high-speed NICs: - Add a new driver for Marvell's Octeon PCI Endpoint NIC VF. - Intel (100G, ice, idpf): - support E825-C devices - nVidia/Mellanox: - support devices with one port and multiple PCIe links - Broadcom (bnxt): - support n-tuple filters - support configuring the RSS key - Wangxun (ngbe/txgbe): - implement irq_domain for TXGBE's sub-interrupts - Pensando/AMD: - support XDP - optimize queue submission and wakeup handling (+17% bps) - optimize struct layout, saving 28% of memory on queues - Ethernet NICs embedded and virtual: - Google cloud vNIC: - refactor driver to perform memory allocations for new queue config before stopping and freeing the old queue memory - Synopsys (stmmac): - obey queueMaxSDU and implement counters required by 802.1Qbv - Renesas (ravb): - support packet checksum offload - suspend to RAM and runtime PM support - Ethernet switches: - nVidia/Mellanox: - support for nexthop group statistics - Microchip: - ksz8: implement PHY loopback - add support for KSZ8567, a 7-port 10/100Mbps switch - PTP: - New driver for RENESAS FemtoClock3 Wireless clock generator. - Support OCP PTP cards designed and built by Adva. - CAN: - Support recvmsg() flags for own, local and remote traffic on CAN BCM sockets. - Support for esd GmbH PCIe/402 CAN device family. - m_can: - Rx/Tx submission coalescing - wake on frame Rx - WiFi: - Intel (iwlwifi): - enable signaling and payload protected A-MSDUs - support wider-bandwidth OFDMA - support for new devices - bump FW API to 89 for AX devices; 90 for BZ/SC devices - MediaTek (mt76): - mt7915: newer ADIE version support - mt7925: radio temperature sensor support - Qualcomm (ath11k): - support 6 GHz station power modes: Low Power Indoor (LPI), Standard Power) SP and Very Low Power (VLP) - QCA6390 & WCN6855: support 2 concurrent station interfaces - QCA2066 support - Qualcomm (ath12k): - refactoring in preparation for Multi-Link Operation (MLO) support - 1024 Block Ack window size support - firmware-2.bin support - support having multiple identical PCI devices (firmware needs to have ATH12K_FW_FEATURE_MULTI_QRTR_ID) - QCN9274: support split-PHY devices - WCN7850: enable Power Save Mode in station mode - WCN7850: P2P support - RealTek: - rtw88: support for more rtw8811cu and rtw8821cu devices - rtw89: support SCAN_RANDOM_SN and SET_SCAN_DWELL - rtlwifi: speed up USB firmware initialization - rtwl8xxxu: - RTL8188F: concurrent interface support - Channel Switch Announcement (CSA) support in AP mode - Broadcom (brcmfmac): - per-vendor feature support - per-vendor SAE password setup - DMI nvram filename quirk for ACEPC W5 Pro" * tag 'net-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2255 commits) nexthop: Fix splat with CONFIG_DEBUG_PREEMPT=y nexthop: Fix out-of-bounds access during attribute validation nexthop: Only parse NHA_OP_FLAGS for dump messages that require it nexthop: Only parse NHA_OP_FLAGS for get messages that require it bpf: move sleepable flag from bpf_prog_aux to bpf_prog bpf: hardcode BPF_PROG_PACK_SIZE to 2MB * num_possible_nodes() selftests/bpf: Add kprobe multi triggering benchmarks ptp: Move from simple ida to xarray vxlan: Remove generic .ndo_get_stats64 vxlan: Do not alloc tstats manually devlink: Add comments to use netlink gen tool nfp: flower: handle acti_netdevs allocation failure net/packet: Add getsockopt support for PACKET_COPY_THRESH net/netlink: Add getsockopt support for NETLINK_LISTEN_ALL_NSID selftests/bpf: Add bpf_arena_htab test. selftests/bpf: Add bpf_arena_list test. selftests/bpf: Add unit tests for bpf_arena_alloc/free_pages bpf: Add helper macro bpf_addr_space_cast() libbpf: Recognize __arena global variables. bpftool: Recognize arena map type ...
Diffstat (limited to 'drivers/net/bonding')
-rw-r--r--drivers/net/bonding/bond_3ad.c165
-rw-r--r--drivers/net/bonding/bond_main.c56
-rw-r--r--drivers/net/bonding/bond_netlink.c16
-rw-r--r--drivers/net/bonding/bond_options.c28
4 files changed, 230 insertions, 35 deletions
diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index c99ffe6c683a..c6807e473ab7 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -82,10 +82,6 @@ enum ad_link_speed_type {
#define MAC_ADDRESS_EQUAL(A, B) \
ether_addr_equal_64bits((const u8 *)A, (const u8 *)B)
-static const u8 null_mac_addr[ETH_ALEN + 2] __long_aligned = {
- 0, 0, 0, 0, 0, 0
-};
-
static const u16 ad_ticks_per_sec = 1000 / AD_TIMER_INTERVAL;
static const int ad_delta_in_ticks = (AD_TIMER_INTERVAL * HZ) / 1000;
@@ -106,6 +102,9 @@ static void ad_agg_selection_logic(struct aggregator *aggregator,
static void ad_clear_agg(struct aggregator *aggregator);
static void ad_initialize_agg(struct aggregator *aggregator);
static void ad_initialize_port(struct port *port, int lacp_fast);
+static void ad_enable_collecting(struct port *port);
+static void ad_disable_distributing(struct port *port,
+ bool *update_slave_arr);
static void ad_enable_collecting_distributing(struct port *port,
bool *update_slave_arr);
static void ad_disable_collecting_distributing(struct port *port,
@@ -172,8 +171,37 @@ static inline int __agg_has_partner(struct aggregator *agg)
}
/**
+ * __disable_distributing_port - disable the port's slave for distributing.
+ * Port will still be able to collect.
+ * @port: the port we're looking at
+ *
+ * This will disable only distributing on the port's slave.
+ */
+static void __disable_distributing_port(struct port *port)
+{
+ bond_set_slave_tx_disabled_flags(port->slave, BOND_SLAVE_NOTIFY_LATER);
+}
+
+/**
+ * __enable_collecting_port - enable the port's slave for collecting,
+ * if it's up
+ * @port: the port we're looking at
+ *
+ * This will enable only collecting on the port's slave.
+ */
+static void __enable_collecting_port(struct port *port)
+{
+ struct slave *slave = port->slave;
+
+ if (slave->link == BOND_LINK_UP && bond_slave_is_up(slave))
+ bond_set_slave_rx_enabled_flags(slave, BOND_SLAVE_NOTIFY_LATER);
+}
+
+/**
* __disable_port - disable the port's slave
* @port: the port we're looking at
+ *
+ * This will disable both collecting and distributing on the port's slave.
*/
static inline void __disable_port(struct port *port)
{
@@ -183,6 +211,8 @@ static inline void __disable_port(struct port *port)
/**
* __enable_port - enable the port's slave, if it's up
* @port: the port we're looking at
+ *
+ * This will enable both collecting and distributing on the port's slave.
*/
static inline void __enable_port(struct port *port)
{
@@ -193,10 +223,27 @@ static inline void __enable_port(struct port *port)
}
/**
- * __port_is_enabled - check if the port's slave is in active state
+ * __port_move_to_attached_state - check if port should transition back to attached
+ * state.
+ * @port: the port we're looking at
+ */
+static bool __port_move_to_attached_state(struct port *port)
+{
+ if (!(port->sm_vars & AD_PORT_SELECTED) ||
+ (port->sm_vars & AD_PORT_STANDBY) ||
+ !(port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) ||
+ !(port->actor_oper_port_state & LACP_STATE_SYNCHRONIZATION))
+ port->sm_mux_state = AD_MUX_ATTACHED;
+
+ return port->sm_mux_state == AD_MUX_ATTACHED;
+}
+
+/**
+ * __port_is_collecting_distributing - check if the port's slave is in the
+ * combined collecting/distributing state
* @port: the port we're looking at
*/
-static inline int __port_is_enabled(struct port *port)
+static int __port_is_collecting_distributing(struct port *port)
{
return bond_is_active_slave(port->slave);
}
@@ -942,6 +989,7 @@ static int ad_marker_send(struct port *port, struct bond_marker *marker)
*/
static void ad_mux_machine(struct port *port, bool *update_slave_arr)
{
+ struct bonding *bond = __get_bond_by_port(port);
mux_states_t last_state;
/* keep current State Machine state to compare later if it was
@@ -999,9 +1047,13 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr)
if ((port->sm_vars & AD_PORT_SELECTED) &&
(port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) &&
!__check_agg_selection_timer(port)) {
- if (port->aggregator->is_active)
- port->sm_mux_state =
- AD_MUX_COLLECTING_DISTRIBUTING;
+ if (port->aggregator->is_active) {
+ int state = AD_MUX_COLLECTING_DISTRIBUTING;
+
+ if (!bond->params.coupled_control)
+ state = AD_MUX_COLLECTING;
+ port->sm_mux_state = state;
+ }
} else if (!(port->sm_vars & AD_PORT_SELECTED) ||
(port->sm_vars & AD_PORT_STANDBY)) {
/* if UNSELECTED or STANDBY */
@@ -1019,11 +1071,45 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr)
}
break;
case AD_MUX_COLLECTING_DISTRIBUTING:
+ if (!__port_move_to_attached_state(port)) {
+ /* if port state hasn't changed make
+ * sure that a collecting distributing
+ * port in an active aggregator is enabled
+ */
+ if (port->aggregator->is_active &&
+ !__port_is_collecting_distributing(port)) {
+ __enable_port(port);
+ *update_slave_arr = true;
+ }
+ }
+ break;
+ case AD_MUX_COLLECTING:
+ if (!__port_move_to_attached_state(port)) {
+ if ((port->sm_vars & AD_PORT_SELECTED) &&
+ (port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) &&
+ (port->partner_oper.port_state & LACP_STATE_COLLECTING)) {
+ port->sm_mux_state = AD_MUX_DISTRIBUTING;
+ } else {
+ /* If port state hasn't changed, make sure that a collecting
+ * port is enabled for an active aggregator.
+ */
+ struct slave *slave = port->slave;
+
+ if (port->aggregator->is_active &&
+ bond_is_slave_rx_disabled(slave)) {
+ ad_enable_collecting(port);
+ *update_slave_arr = true;
+ }
+ }
+ }
+ break;
+ case AD_MUX_DISTRIBUTING:
if (!(port->sm_vars & AD_PORT_SELECTED) ||
(port->sm_vars & AD_PORT_STANDBY) ||
+ !(port->partner_oper.port_state & LACP_STATE_COLLECTING) ||
!(port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) ||
!(port->actor_oper_port_state & LACP_STATE_SYNCHRONIZATION)) {
- port->sm_mux_state = AD_MUX_ATTACHED;
+ port->sm_mux_state = AD_MUX_COLLECTING;
} else {
/* if port state hasn't changed make
* sure that a collecting distributing
@@ -1031,7 +1117,7 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr)
*/
if (port->aggregator &&
port->aggregator->is_active &&
- !__port_is_enabled(port)) {
+ !__port_is_collecting_distributing(port)) {
__enable_port(port);
*update_slave_arr = true;
}
@@ -1082,6 +1168,20 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr)
update_slave_arr);
port->ntt = true;
break;
+ case AD_MUX_COLLECTING:
+ port->actor_oper_port_state |= LACP_STATE_COLLECTING;
+ port->actor_oper_port_state &= ~LACP_STATE_DISTRIBUTING;
+ port->actor_oper_port_state |= LACP_STATE_SYNCHRONIZATION;
+ ad_enable_collecting(port);
+ ad_disable_distributing(port, update_slave_arr);
+ port->ntt = true;
+ break;
+ case AD_MUX_DISTRIBUTING:
+ port->actor_oper_port_state |= LACP_STATE_DISTRIBUTING;
+ port->actor_oper_port_state |= LACP_STATE_SYNCHRONIZATION;
+ ad_enable_collecting_distributing(port,
+ update_slave_arr);
+ break;
default:
break;
}
@@ -1484,7 +1584,7 @@ static void ad_port_selection_logic(struct port *port, bool *update_slave_arr)
(aggregator->partner_system_priority == port->partner_oper.system_priority) &&
(aggregator->partner_oper_aggregator_key == port->partner_oper.key)
) &&
- ((!MAC_ADDRESS_EQUAL(&(port->partner_oper.system), &(null_mac_addr)) && /* partner answers */
+ ((__agg_has_partner(aggregator) && /* partner answers */
!aggregator->is_individual) /* but is not individual OR */
)
) {
@@ -1907,6 +2007,43 @@ static void ad_initialize_port(struct port *port, int lacp_fast)
}
/**
+ * ad_enable_collecting - enable a port's receive
+ * @port: the port we're looking at
+ *
+ * Enable @port if it's in an active aggregator
+ */
+static void ad_enable_collecting(struct port *port)
+{
+ if (port->aggregator->is_active) {
+ struct slave *slave = port->slave;
+
+ slave_dbg(slave->bond->dev, slave->dev,
+ "Enabling collecting on port %d (LAG %d)\n",
+ port->actor_port_number,
+ port->aggregator->aggregator_identifier);
+ __enable_collecting_port(port);
+ }
+}
+
+/**
+ * ad_disable_distributing - disable a port's transmit
+ * @port: the port we're looking at
+ * @update_slave_arr: Does slave array need update?
+ */
+static void ad_disable_distributing(struct port *port, bool *update_slave_arr)
+{
+ if (port->aggregator && __agg_has_partner(port->aggregator)) {
+ slave_dbg(port->slave->bond->dev, port->slave->dev,
+ "Disabling distributing on port %d (LAG %d)\n",
+ port->actor_port_number,
+ port->aggregator->aggregator_identifier);
+ __disable_distributing_port(port);
+ /* Slave array needs an update */
+ *update_slave_arr = true;
+ }
+}
+
+/**
* ad_enable_collecting_distributing - enable a port's transmit/receive
* @port: the port we're looking at
* @update_slave_arr: Does slave array need update?
@@ -1935,9 +2072,7 @@ static void ad_enable_collecting_distributing(struct port *port,
static void ad_disable_collecting_distributing(struct port *port,
bool *update_slave_arr)
{
- if (port->aggregator &&
- !MAC_ADDRESS_EQUAL(&(port->aggregator->partner_system),
- &(null_mac_addr))) {
+ if (port->aggregator && __agg_has_partner(port->aggregator)) {
slave_dbg(port->slave->bond->dev, port->slave->dev,
"Disabling port %d (LAG %d)\n",
port->actor_port_number,
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index cd0683bcca03..2c5ed0a7cb18 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2611,7 +2611,7 @@ static int bond_miimon_inspect(struct bonding *bond)
bond_propose_link_state(slave, BOND_LINK_FAIL);
commit++;
slave->delay = bond->params.downdelay;
- if (slave->delay) {
+ if (slave->delay && net_ratelimit()) {
slave_info(bond->dev, slave->dev, "link status down for %sinterface, disabling it in %d ms\n",
(BOND_MODE(bond) ==
BOND_MODE_ACTIVEBACKUP) ?
@@ -2625,9 +2625,10 @@ static int bond_miimon_inspect(struct bonding *bond)
/* recovered before downdelay expired */
bond_propose_link_state(slave, BOND_LINK_UP);
slave->last_link_up = jiffies;
- slave_info(bond->dev, slave->dev, "link status up again after %d ms\n",
- (bond->params.downdelay - slave->delay) *
- bond->params.miimon);
+ if (net_ratelimit())
+ slave_info(bond->dev, slave->dev, "link status up again after %d ms\n",
+ (bond->params.downdelay - slave->delay) *
+ bond->params.miimon);
commit++;
continue;
}
@@ -2649,7 +2650,7 @@ static int bond_miimon_inspect(struct bonding *bond)
commit++;
slave->delay = bond->params.updelay;
- if (slave->delay) {
+ if (slave->delay && net_ratelimit()) {
slave_info(bond->dev, slave->dev, "link status up, enabling it in %d ms\n",
ignore_updelay ? 0 :
bond->params.updelay *
@@ -2659,9 +2660,10 @@ static int bond_miimon_inspect(struct bonding *bond)
case BOND_LINK_BACK:
if (!link_state) {
bond_propose_link_state(slave, BOND_LINK_DOWN);
- slave_info(bond->dev, slave->dev, "link status down again after %d ms\n",
- (bond->params.updelay - slave->delay) *
- bond->params.miimon);
+ if (net_ratelimit())
+ slave_info(bond->dev, slave->dev, "link status down again after %d ms\n",
+ (bond->params.updelay - slave->delay) *
+ bond->params.miimon);
commit++;
continue;
}
@@ -6305,6 +6307,7 @@ static int __init bond_check_params(struct bond_params *params)
params->ad_actor_sys_prio = ad_actor_sys_prio;
eth_zero_addr(params->ad_actor_system);
params->ad_user_port_key = ad_user_port_key;
+ params->coupled_control = 1;
if (packets_per_slave > 0) {
params->reciprocal_packets_per_slave =
reciprocal_value(packets_per_slave);
@@ -6414,28 +6417,41 @@ static int __net_init bond_net_init(struct net *net)
return 0;
}
-static void __net_exit bond_net_exit_batch(struct list_head *net_list)
+/* According to commit 69b0216ac255 ("bonding: fix bonding_masters
+ * race condition in bond unloading") we need to remove sysfs files
+ * before we remove our devices (done later in bond_net_exit_batch_rtnl())
+ */
+static void __net_exit bond_net_pre_exit(struct net *net)
+{
+ struct bond_net *bn = net_generic(net, bond_net_id);
+
+ bond_destroy_sysfs(bn);
+}
+
+static void __net_exit bond_net_exit_batch_rtnl(struct list_head *net_list,
+ struct list_head *dev_kill_list)
{
struct bond_net *bn;
struct net *net;
- LIST_HEAD(list);
-
- list_for_each_entry(net, net_list, exit_list) {
- bn = net_generic(net, bond_net_id);
- bond_destroy_sysfs(bn);
- }
/* Kill off any bonds created after unregistering bond rtnl ops */
- rtnl_lock();
list_for_each_entry(net, net_list, exit_list) {
struct bonding *bond, *tmp_bond;
bn = net_generic(net, bond_net_id);
list_for_each_entry_safe(bond, tmp_bond, &bn->dev_list, bond_list)
- unregister_netdevice_queue(bond->dev, &list);
+ unregister_netdevice_queue(bond->dev, dev_kill_list);
}
- unregister_netdevice_many(&list);
- rtnl_unlock();
+}
+
+/* According to commit 23fa5c2caae0 ("bonding: destroy proc directory
+ * only after all bonds are gone") bond_destroy_proc_dir() is called
+ * after bond_net_exit_batch_rtnl() has completed.
+ */
+static void __net_exit bond_net_exit_batch(struct list_head *net_list)
+{
+ struct bond_net *bn;
+ struct net *net;
list_for_each_entry(net, net_list, exit_list) {
bn = net_generic(net, bond_net_id);
@@ -6445,6 +6461,8 @@ static void __net_exit bond_net_exit_batch(struct list_head *net_list)
static struct pernet_operations bond_net_ops = {
.init = bond_net_init,
+ .pre_exit = bond_net_pre_exit,
+ .exit_batch_rtnl = bond_net_exit_batch_rtnl,
.exit_batch = bond_net_exit_batch,
.id = &bond_net_id,
.size = sizeof(struct bond_net),
diff --git a/drivers/net/bonding/bond_netlink.c b/drivers/net/bonding/bond_netlink.c
index cfa74cf8bb1a..29b4c3d1b9b6 100644
--- a/drivers/net/bonding/bond_netlink.c
+++ b/drivers/net/bonding/bond_netlink.c
@@ -122,6 +122,7 @@ static const struct nla_policy bond_policy[IFLA_BOND_MAX + 1] = {
[IFLA_BOND_PEER_NOTIF_DELAY] = NLA_POLICY_FULL_RANGE(NLA_U32, &delay_range),
[IFLA_BOND_MISSED_MAX] = { .type = NLA_U8 },
[IFLA_BOND_NS_IP6_TARGET] = { .type = NLA_NESTED },
+ [IFLA_BOND_COUPLED_CONTROL] = { .type = NLA_U8 },
};
static const struct nla_policy bond_slave_policy[IFLA_BOND_SLAVE_MAX + 1] = {
@@ -549,6 +550,16 @@ static int bond_changelink(struct net_device *bond_dev, struct nlattr *tb[],
return err;
}
+ if (data[IFLA_BOND_COUPLED_CONTROL]) {
+ int coupled_control = nla_get_u8(data[IFLA_BOND_COUPLED_CONTROL]);
+
+ bond_opt_initval(&newval, coupled_control);
+ err = __bond_opt_set(bond, BOND_OPT_COUPLED_CONTROL, &newval,
+ data[IFLA_BOND_COUPLED_CONTROL], extack);
+ if (err)
+ return err;
+ }
+
return 0;
}
@@ -615,6 +626,7 @@ static size_t bond_get_size(const struct net_device *bond_dev)
/* IFLA_BOND_NS_IP6_TARGET */
nla_total_size(sizeof(struct nlattr)) +
nla_total_size(sizeof(struct in6_addr)) * BOND_MAX_NS_TARGETS +
+ nla_total_size(sizeof(u8)) + /* IFLA_BOND_COUPLED_CONTROL */
0;
}
@@ -774,6 +786,10 @@ static int bond_fill_info(struct sk_buff *skb,
bond->params.missed_max))
goto nla_put_failure;
+ if (nla_put_u8(skb, IFLA_BOND_COUPLED_CONTROL,
+ bond->params.coupled_control))
+ goto nla_put_failure;
+
if (BOND_MODE(bond) == BOND_MODE_8023AD) {
struct ad_info info;
diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
index f3f27f0bd2a6..4cdbc7e084f4 100644
--- a/drivers/net/bonding/bond_options.c
+++ b/drivers/net/bonding/bond_options.c
@@ -84,7 +84,8 @@ static int bond_option_ad_user_port_key_set(struct bonding *bond,
const struct bond_opt_value *newval);
static int bond_option_missed_max_set(struct bonding *bond,
const struct bond_opt_value *newval);
-
+static int bond_option_coupled_control_set(struct bonding *bond,
+ const struct bond_opt_value *newval);
static const struct bond_opt_value bond_mode_tbl[] = {
{ "balance-rr", BOND_MODE_ROUNDROBIN, BOND_VALFLAG_DEFAULT},
@@ -232,6 +233,12 @@ static const struct bond_opt_value bond_missed_max_tbl[] = {
{ NULL, -1, 0},
};
+static const struct bond_opt_value bond_coupled_control_tbl[] = {
+ { "on", 1, BOND_VALFLAG_DEFAULT},
+ { "off", 0, 0},
+ { NULL, -1, 0},
+};
+
static const struct bond_option bond_opts[BOND_OPT_LAST] = {
[BOND_OPT_MODE] = {
.id = BOND_OPT_MODE,
@@ -496,6 +503,15 @@ static const struct bond_option bond_opts[BOND_OPT_LAST] = {
.desc = "Delay between each peer notification on failover event, in milliseconds",
.values = bond_peer_notif_delay_tbl,
.set = bond_option_peer_notif_delay_set
+ },
+ [BOND_OPT_COUPLED_CONTROL] = {
+ .id = BOND_OPT_COUPLED_CONTROL,
+ .name = "coupled_control",
+ .desc = "Opt into using coupled control MUX for LACP states",
+ .unsuppmodes = BOND_MODE_ALL_EX(BIT(BOND_MODE_8023AD)),
+ .flags = BOND_OPTFLAG_IFDOWN,
+ .values = bond_coupled_control_tbl,
+ .set = bond_option_coupled_control_set,
}
};
@@ -1692,3 +1708,13 @@ static int bond_option_ad_user_port_key_set(struct bonding *bond,
bond->params.ad_user_port_key = newval->value;
return 0;
}
+
+static int bond_option_coupled_control_set(struct bonding *bond,
+ const struct bond_opt_value *newval)
+{
+ netdev_info(bond->dev, "Setting coupled_control to %s (%llu)\n",
+ newval->string, newval->value);
+
+ bond->params.coupled_control = newval->value;
+ return 0;
+}