summaryrefslogtreecommitdiffstats
path: root/net/bridge/br_sysfs_br.c
Commit message (Collapse)AuthorAgeFilesLines
* net: bridge: fdb: add support for fine-grained flushingNikolay Aleksandrov2022-04-131-1/+5
| | | | | | | | | | | | | | Add the ability to specify exactly which fdbs to be flushed. They are described by a new structure - net_bridge_fdb_flush_desc. Currently it can match on port/bridge ifindex, vlan id and fdb flags. It is used to describe the existing dynamic fdb flush operation. Note that this flush operation doesn't treat permanent entries in a special way (fdb_delete vs fdb_delete_local), it will delete them regardless if any port is using them, so currently it can't directly replace deletes which need to handle that case, although we can extend it later for that too. Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2021-12-301-2/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | drivers/net/ethernet/mellanox/mlx5/core/en_tc.c commit 077cdda764c7 ("net/mlx5e: TC, Fix memory leak with rules with internal port") commit 31108d142f36 ("net/mlx5: Fix some error handling paths in 'mlx5e_tc_add_fdb_flow()'") commit 4390c6edc0fb ("net/mlx5: Fix some error handling paths in 'mlx5e_tc_add_fdb_flow()'") https://lore.kernel.org/all/20211229065352.30178-1-saeed@kernel.org/ net/smc/smc_wr.c commit 49dc9013e34b ("net/smc: Use the bitmap API when applicable") commit 349d43127dac ("net/smc: fix kernel panic caused by race of smc_sock") bitmap_zero()/memset() is removed by the fix Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * net: bridge: mcast: add and enforce startup query interval minimumNikolay Aleksandrov2021-12-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As reported[1] if startup query interval is set too low in combination with large number of startup queries and we have multiple bridges or even a single bridge with multiple querier vlans configured we can crash the machine. Add a 1 second minimum which must be enforced by overwriting the value if set lower (i.e. without returning an error) to avoid breaking user-space. If that happens a log message is emitted to let the admin know that the startup interval has been set to the minimum. It doesn't make sense to make the startup interval lower than the normal query interval so use the same value of 1 second. The issue has been present since these intervals could be user-controlled. [1] https://lore.kernel.org/netdev/e8b9ce41-57b9-b6e2-a46a-ff9c791cf0ba@gmail.com/ Fixes: d902eee43f19 ("bridge: Add multicast count/interval sysfs entries") Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * net: bridge: mcast: add and enforce query interval minimumNikolay Aleksandrov2021-12-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As reported[1] if query interval is set too low and we have multiple bridges or even a single bridge with multiple querier vlans configured we can crash the machine. Add a 1 second minimum which must be enforced by overwriting the value if set lower (i.e. without returning an error) to avoid breaking user-space. If that happens a log message is emitted to let the administrator know that the interval has been set to the minimum. The issue has been present since these intervals could be user-controlled. [1] https://lore.kernel.org/netdev/e8b9ce41-57b9-b6e2-a46a-ff9c791cf0ba@gmail.com/ Fixes: d902eee43f19 ("bridge: Add multicast count/interval sysfs entries") Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | net: bridge: Allow base 16 inputs in sysfsIdo Schimmel2021-11-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Cited commit converted simple_strtoul() to kstrtoul() as suggested by the former's documentation. However, it also forced all the inputs to be decimal resulting in user space breakage. Fix by setting the base to '0' so that the base is automatically detected. Before: # ip link add name br0 type bridge vlan_filtering 1 # echo "0x88a8" > /sys/class/net/br0/bridge/vlan_protocol bash: echo: write error: Invalid argument After: # ip link add name br0 type bridge vlan_filtering 1 # echo "0x88a8" > /sys/class/net/br0/bridge/vlan_protocol # echo $? 0 Fixes: 520fbdf7fb19 ("net/bridge: replace simple_strtoul to kstrtol") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Link: https://lore.kernel.org/r/20211124101122.3321496-1-idosch@idosch.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | net/bridge: replace simple_strtoul to kstrtolBernard Zhao2021-11-191-4/+3
|/ | | | | | | simple_strtoull is obsolete, use kstrtol instead. Signed-off-by: Bernard Zhao <bernard@vivo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: bridge: vlan: add support for mcast router global optionNikolay Aleksandrov2021-08-111-1/+1
| | | | | | | | | | Add support to change and retrieve global vlan multicast router state which is used for the bridge itself. We just need to pass multicast context to br_multicast_set_router instead of bridge device and the rest of the logic remains the same. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: bridge: vlan: add support for mcast querier global optionNikolay Aleksandrov2021-08-111-2/+2
| | | | | | | | | Add support to change and retrieve global vlan multicast querier state. We just need to pass multicast context to br_multicast_set_querier instead of bridge device and the rest of the logic remains the same. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: bridge: mcast: move querier state to the multicast contextNikolay Aleksandrov2021-08-111-1/+1
| | | | | | | | | We need to have the querier state per multicast context in order to have per-vlan control, so remove the internal option bit and move it to the multicast context. Also annotate the lockless reads of the new variable. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: bridge: vlan: add support for mcast igmp/mld version global optionsNikolay Aleksandrov2021-08-111-2/+2
| | | | | | | Add support to change and retrieve global vlan IGMP/MLD versions. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: bridge: multicast: factor out bridge multicast contextNikolay Aleksandrov2021-07-201-19/+19
| | | | | | | | | Factor out the bridge's global multicast context into a separate structure which will later be used for per-vlan global context. No functional changes intended. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: bridge: propagate error code and extack from br_mc_disabled_updateFlorian Fainelli2021-04-141-7/+1
| | | | | | | | | | | | | Some Ethernet switches might only be able to support disabling multicast snooping globally, which is an issue for example when several bridges span the same physical device and request contradictory settings. Propagate the return value of br_mc_disabled_update() such that this limitation is transmitted correctly to user-space. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: bridge: propagate extack through store_bridge_parmVladimir Oltean2021-02-141-38/+128
| | | | | | | | | | | | | | | | | | | | The bridge sysfs interface stores parameters for the STP, VLAN, multicast etc subsystems using a predefined function prototype. Sometimes the underlying function being called supports a netlink extended ack message, and we ignore it. Let's expand the store_bridge_parm function prototype to include the extack, and just print it to console, but at least propagate it where applicable. Where not applicable, create a shim function in the br_sysfs_br.c file that discards the extra function argument. This patch allows us to propagate the extack argument to br_vlan_set_default_pvid, br_vlan_set_proto and br_vlan_filter_toggle, and from there, further up in br_changelink from br_netlink.c. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: bridge: add warning comments to avoid extending sysfsNikolay Aleksandrov2021-01-291-0/+4
| | | | | | | | We're moving to netlink-only options, so add comments in the bridge's sysfs files to warn against adding any new sysfs entries. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* net: bridge: Add checks for enabling the STP.Horatiu Vultur2020-04-271-3/+1
| | | | | | | | | | It is not possible to have the MRP and STP running at the same time on the bridge, therefore add check when enabling the STP to check if MRP is already enabled. In that case return error. Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152Thomas Gleixner2019-05-301-5/+1
| | | | | | | | | | | | | | | | | | | | | Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* net: bridge: mark hash_elasticity as obsoleteNikolay Aleksandrov2018-12-051-3/+3
| | | | | | | | | | Now that the bridge multicast uses the generic rhashtable interface we can drop the hash_elasticity option as that is already done for us and it's hardcoded to a maximum of RHT_ELASTICITY (16 currently). Add a warning about the obsolete option when the hash_elasticity is set. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: bridge: convert multicast to generic rhashtableNikolay Aleksandrov2018-12-051-1/+7
| | | | | | | | | | | | | | | | | | | | | | The bridge multicast code currently uses a custom resizable hashtable which predates the generic rhashtable interface. It has many shortcomings compared and duplicates functionality that is presently available via the generic rhashtable, so this patch removes the custom rhashtable implementation in favor of the kernel's generic rhashtable. The hash maximum is kept and the rhashtable's size is used to do a loose check if it's reached in which case we revert to the old behaviour and disable further bridge multicast processing. Also now we can support any hash maximum, doesn't need to be a power of 2. v3: add non-rcu br_mdb_get variant and use it where multicast_lock is held to avoid RCU splat, drop hash_max function and just set it directly v2: handle when IGMP snooping is undefined, add br_mdb_init/uninit placeholders Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: bridge: add no_linklocal_learn bool optionNikolay Aleksandrov2018-11-271-0/+22
| | | | | | | | | | | | | Use the new boolopt API to add an option which disables learning from link-local packets. The default is kept as before and learning is enabled. This is a simple map from a boolopt bit to a bridge private flag that is tested before learning. v2: pass NULL for extack via sysfs Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: bridge: add support for per-port vlan statsNikolay Aleksandrov2018-10-121-0/+17
| | | | | | | | | | | | | | | | | | This patch adds an option to have per-port vlan stats instead of the default global stats. The option can be set only when there are no port vlans in the bridge since we need to allocate the stats if it is set when vlans are being added to ports (and respectively free them when being deleted). Also bump RTNL_MAX_TYPE as the bridge is the largest user of options. The current stats design allows us to add these without any changes to the fast-path, it all comes down to the per-vlan stats pointer which, if this option is enabled, will be allocated for each port vlan instead of using the global bridge-wide one. CC: bridge@lists.linux-foundation.org CC: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: bridge: convert mcast options to bitsNikolay Aleksandrov2018-09-261-5/+7
| | | | | | | | | | This patch converts the rest of the mcast options to bits. It also packs the mcast options a little better by moving multicast_mld_version to an existing hole, reducing the net_bridge size by 8 bytes. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: bridge: convert and rename mcast disabledNikolay Aleksandrov2018-09-261-1/+1
| | | | | | | | | | | | Convert mcast disabled to an option bit and while doing so convert the logic to check if multicast is enabled instead. That is make the logic follow the option value - if it's set then mcast is enabled and vice versa. This avoids a few confusing places where we inverted the value that's being set to follow the mcast_disabled logic. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: bridge: convert group_addr_set option to a bitNikolay Aleksandrov2018-09-261-1/+1
| | | | | | | | Convert group_addr_set internal bridge opt to a bit. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: bridge: convert nf call options to bitsNikolay Aleksandrov2018-09-261-6/+6
| | | | | | | | No functional change, convert of nf_call_[ip|ip6|arp]tables to bits. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: bridge: add bitfield for options and convert vlan optsNikolay Aleksandrov2018-09-261-2/+2
| | | | | | | | | | | | | Bridge options have usually been added as separate fields all over the net_bridge struct taking up space and ending up in different cache lines. Let's move them to a single bitfield to save up space and speedup lookups. This patch adds a simple API for option modifying and retrieving using bitops and converts the first user of the API - the bridge vlan options (vlan_enabled and vlan_stats_enabled). Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Use octal not symbolic permissionsJoe Perches2018-03-261-1/+1
| | | | | | | | | | | | | | Prefer the direct use of octal for permissions. Done with checkpatch -f --types=SYMBOLIC_PERMS --fix-inplace and some typing. Miscellanea: o Whitespace neatening around these conversions. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: Use helpers to handle MAC addressAndy Shevchenko2017-12-201-10/+3
| | | | | | | | | | | Use %pM to print MAC mac_pton() to convert it from ASCII to binary format, and ether_addr_copy() to copy. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: bridge: constify attribute_group structures.Arvind Yadav2017-06-291-1/+1
| | | | | | | | | | | | | | | | | attribute_groups are not supposed to change at runtime. All functions working with attribute_groups provided by <linux/sysfs.h> work with const attribute_group. So mark the non-const structs as const. File size before: text data bss dec hex filename 2645 896 0 3541 dd5 net/bridge/br_sysfs_br.o File size After adding 'const': text data bss dec hex filename 2701 832 0 3533 dcd net/bridge/br_sysfs_br.o Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sched/headers: Prepare to move signal wakeup & sigpending methods from ↵Ingo Molnar2017-03-021-0/+1
| | | | | | | | | | | | | <linux/sched.h> into <linux/sched/signal.h> Fix up affected files that include this signal functionality via sched.h. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
* bridge: move to workqueue gcNikolay Aleksandrov2017-02-061-1/+1
| | | | | | | | | | | | | Move the fdb garbage collector to a workqueue which fires at least 10 milliseconds apart and cleans chain by chain allowing for other tasks to run in the meantime. When having thousands of fdbs the system is much more responsive. Most importantly remove the need to check if the matched entry has expired in __br_fdb_get that causes false-sharing and is completely unnecessary if we cleanup entries, at worst we'll get 10ms of traffic for that entry before it gets deleted. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2016-12-061-0/+1
|\
| * net: bridge: set error code on failurePan Bian2016-12-051-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | Function br_sysfs_addbr() does not set error code when the call kobject_create_and_add() returns a NULL pointer. It may be better to return "-ENOMEM" when kobject_create_and_add() fails. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188781 Signed-off-by: Pan Bian <bianpan2016@163.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bridge: mcast: add MLDv2 querier supportNikolay Aleksandrov2016-11-211-0/+22
| | | | | | | | | | | | | | | | | | | | This patch adds basic support for MLDv2 queries, the default is MLDv1 as before. A new multicast option - multicast_mld_version, adds the ability to change it between 1 and 2 via netlink and sysfs. The MLD option is disabled if CONFIG_IPV6 is disabled. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bridge: mcast: add IGMPv3 query supportNikolay Aleksandrov2016-11-211-0/+18
|/ | | | | | | | | | | | | This patch adds basic support for IGMPv3 queries, the default is IGMPv2 as before. A new multicast option - multicast_igmp_version, adds the ability to change it between 2 and 3 via netlink and sysfs. The option struct member is in a 4 byte hole in net_bridge. There also a few minor style adjustments in br_multicast_new_group and br_multicast_add_group. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: bridge: add support for IGMP/MLD stats and export them via netlinkNikolay Aleksandrov2016-06-301-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds stats support for the currently used IGMP/MLD types by the bridge. The stats are per-port (plus one stat per-bridge) and per-direction (RX/TX). The stats are exported via netlink via the new linkxstats API (RTM_GETSTATS). In order to minimize the performance impact, a new option is used to enable/disable the stats - multicast_stats_enabled, similar to the recent vlan stats. Also in order to avoid multiple IGMP/MLD type lookups and checks, we make use of the current "igmp" member of the bridge private skb->cb region to record the type on Rx (both host-generated and external packets pass by multicast_rcv()). We can do that since the igmp member was used as a boolean and all the valid IGMP/MLD types are positive values. The normal bridge fast-path is not affected at all, the only affected paths are the flooding ones and since we make use of the IGMP/MLD type, we can quickly determine if the packet should be counted using cache-hot data (cb's igmp member). We add counters for: * IGMP Queries * IGMP Leaves * IGMP v1/v2/v3 reports * MLD Queries * MLD Leaves * MLD v1/v2 reports These are invaluable when monitoring or debugging complex multicast setups with bridges. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: vlan: learn to countNikolay Aleksandrov2016-05-021-0/+17
| | | | | | | | | | | | | | | | Add support for per-VLAN Tx/Rx statistics. Every global vlan context gets allocated a per-cpu stats which is then set in each per-port vlan context for quick access. The br_allowed_ingress() common function is used to account for Rx packets and the br_handle_vlan() common function is used to account for Tx packets. Stats accounting is performed only if the bridge-wide vlan_stats_enabled option is set either via sysfs or netlink. A struct hole between vlan_enabled and vlan_proto is used for the new option so it is in the same cache line. Currently it is binary (on/off) but it is intentionally restricted to exactly 0 and 1 since other values will be used in the future for different purposes (e.g. per-port stats). Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: a netlink notification should be sent when those attributes are ↵Xin Long2016-04-131-12/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | changed by br_sysfs_br Now when we change the attributes of bridge or br_port by netlink, a relevant netlink notification will be sent, but if we change them by ioctl or sysfs, no notification will be sent. We should ensure that whenever those attributes change internally or from sysfs/ioctl, that a netlink notification is sent out to listeners. Also, NetworkManager will use this in the future to listen for out-of-band bridge master attribute updates and incorporate them into the runtime configuration. This patch is used for br_sysfs_br. and we also need to remove some rtnl_trylock in old functions so that we can call it in a common one. For group_addr_store, we cannot make it use store_bridge_parm, because it's not a string-to-long convert, we will add notification on it individually. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: simplify the stp_state_store by calling store_bridge_parmXin Long2016-04-131-15/+9
| | | | | | | | | There are some repetitive codes in stp_state_store, we can remove them by calling store_bridge_parm. Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: simplify the forward_delay_store by calling store_bridge_parmXin Long2016-04-131-17/+10
| | | | | | | | | There are some repetitive codes in forward_delay_store, we can remove them by calling store_bridge_parm. Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: simplify the flush_store by calling store_bridge_parmXin Long2016-04-131-7/+7
| | | | | | | | | | There are some repetitive codes in flush_store, we can remove them by calling store_bridge_parm, also, it would send rtnl notification after we add it in store_bridge_parm in the following patches. Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: use kobj_to_dev instead of to_devGeliang Tang2015-12-231-2/+1
| | | | | | | | kobj_to_dev has been defined in linux/device.h, so I replace to_dev with it. Signed-off-by: Geliang Tang <geliangtang@163.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: fix gc_timer mod/del race conditionNikolay Aleksandrov2015-10-131-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit c62987bbd8a1 ("bridge: push bridge setting ageing_time down to switchdev") introduced a timer race condition because the gc_timer can get rearmed after it's supposedly stopped and flushed in br_dev_delete() leading to a use of freed memory. So take rtnl to sync with bridge destruction when setting ageing_timer. Here's the trace reproduced with these two commands running in parallel: while :; do echo 10000 > /sys/class/net/br0/bridge/ageing_timer; done; while :; do brctl addbr br0; ip l set br0 up; ip l set br0 down; brctl delbr br0; done; [ 300.000029] BUG: unable to handle kernel paging request at ffffffff811c59d3 [ 300.000263] IP: [<ffffffff810f168e>] __internal_add_timer+0x2e/0xd0 [ 300.000422] PGD 1a0f067 PUD 1a10063 PMD 10001e1 [ 300.000639] Oops: 0003 [#1] SMP [ 300.000793] Modules linked in: bridge stp llc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ppdev aesni_intel aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd snd_hda_codec_generic qxl drm_kms_helper psmouse pcspkr ttm snd_hda_intel 9pnet_virtio evdev serio_raw joydev snd_hda_codec 9pnet virtio_balloon drm snd_hwdep virtio_console snd_hda_core pvpanic snd_pcm i2c_piix4 snd_timer acpi_cpufreq parport_pc snd parport soundcore button processor i2c_core ipv6 autofs4 hid_generic usbhid hid ext4 crc16 mbcache jbd2 sg sr_mod cdrom ata_generic virtio_blk virtio_net e1000 ehci_pci uhci_hcd ehci_hcd usbcore usb_common floppy ata_piix libata virtio_pci virtio_ring virtio scsi_mod [ 300.004008] CPU: 1 PID: 1169 Comm: bash Not tainted 4.3.0-rc3+ #46 [ 300.004008] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 300.004008] task: ffff880035be2200 ti: ffff88003795c000 task.ti: ffff88003795c000 [ 300.004008] RIP: 0010:[<ffffffff810f168e>] [<ffffffff810f168e>] __internal_add_timer+0x2e/0xd0 [ 300.004008] RSP: 0018:ffff88003fd03e78 EFLAGS: 00010046 [ 300.004008] RAX: ffff88003fd0ef60 RBX: 840fc78949c08548 RCX: 00000001ffffffff [ 300.004008] RDX: 0000000000000000 RSI: ffffffff811c59d3 RDI: ffff88003fd0df00 [ 300.004008] RBP: ffff88003fd03e78 R08: 00000000ffffffff R09: 0000000000000000 [ 300.004008] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88003fd0df00 [ 300.004008] R13: 0000000000000000 R14: 0000000000000001 R15: ffffffff816032e0 [ 300.004008] FS: 00007fcbdd609700(0000) GS:ffff88003fd00000(0000) knlGS:0000000000000000 [ 300.004008] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 300.004008] CR2: ffffffff811c59d3 CR3: 0000000037879000 CR4: 00000000000406e0 [ 300.004008] Stack: [ 300.004008] ffff88003fd03ea8 ffffffff810f1775 ffff88003c8cb958 ffff88003fd0df00 [ 300.004008] 0000000000000000 0000000000000001 ffff88003fd03f18 ffffffff810f28c4 [ 300.004008] ffff88003fd0eb68 ffff88003fd0e968 ffff88003fd0e768 ffff88003fd0df68 [ 300.004008] Call Trace: [ 300.004008] <IRQ> [ 300.004008] [<ffffffff810f1775>] cascade+0x45/0x70 [ 300.004008] [<ffffffff810f28c4>] run_timer_softirq+0x2f4/0x340 [ 300.004008] [<ffffffff8107e380>] __do_softirq+0xd0/0x440 [ 300.004008] [<ffffffff8107e8a3>] irq_exit+0xb3/0xc0 [ 300.004008] [<ffffffff815c2032>] smp_apic_timer_interrupt+0x42/0x50 [ 300.004008] [<ffffffff815bfe37>] apic_timer_interrupt+0x87/0x90 [ 300.004008] <EOI> [ 300.004008] [<ffffffff811fb80c>] ? create_object+0x13c/0x2e0 [ 300.004008] [<ffffffff8109b23e>] ? __kernel_text_address+0x4e/0x70 [ 300.004008] [<ffffffff8109b23e>] ? __kernel_text_address+0x4e/0x70 [ 300.004008] [<ffffffff8101e17f>] print_context_stack+0x7f/0xf0 [ 300.004008] [<ffffffff8101d55b>] dump_trace+0x11b/0x300 [ 300.004008] [<ffffffff8102970b>] save_stack_trace+0x2b/0x50 [ 300.004008] [<ffffffff811fb80c>] create_object+0x13c/0x2e0 [ 300.004008] [<ffffffff815b2e8e>] kmemleak_alloc+0x4e/0xb0 [ 300.004008] [<ffffffff811e475d>] kmem_cache_alloc_trace+0x18d/0x2f0 [ 300.004008] [<ffffffff8128b139>] kernfs_fop_open+0xc9/0x380 [ 300.004008] [<ffffffff8120214f>] do_dentry_open+0x1ff/0x2f0 [ 300.004008] [<ffffffff8128b070>] ? kernfs_fop_release+0x70/0x70 [ 300.004008] [<ffffffff812034f9>] vfs_open+0x59/0x60 [ 300.004008] [<ffffffff812130de>] path_openat+0x1ce/0x1260 [ 300.004008] [<ffffffff812154ae>] do_filp_open+0x7e/0xe0 [ 300.004008] [<ffffffff812251ff>] ? __alloc_fd+0xaf/0x180 [ 300.004008] [<ffffffff8120387b>] do_sys_open+0x12b/0x210 [ 300.004008] [<ffffffff8120397e>] SyS_open+0x1e/0x20 [ 300.004008] [<ffffffff815bf0b6>] entry_SYSCALL_64_fastpath+0x16/0x7a [ 300.004008] Code: 66 90 48 8b 46 10 48 8b 4f 40 55 48 89 c2 48 89 e5 48 29 ca 48 81 fa ff 00 00 00 77 20 0f b6 c0 48 8d 44 c7 68 48 8b 10 48 85 d2 <48> 89 16 74 04 48 89 72 08 48 89 30 48 89 46 08 5d c3 48 81 fa [ 300.004008] RIP [<ffffffff810f168e>] __internal_add_timer+0x2e/0xd0 [ 300.004008] RSP <ffff88003fd03e78> [ 300.004008] CR2: ffffffff811c59d3 Fixes: c62987bbd8a1 ("bridge: push bridge setting ageing_time down to switchdev") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: push bridge setting ageing_time down to switchdevScott Feldman2015-10-121-2/+1
| | | | | | | | | | | | | | | Use SWITCHDEV_F_SKIP_EOPNOTSUPP to skip over ports in bridge that don't support setting ageing_time (or setting bridge attrs in general). If push fails, don't update ageing_time in bridge and return err to user. If push succeeds, update ageing_time in bridge and run gc_timer now to recalabrate when to run gc_timer next, based on new ageing_time. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: Add a default_pvid sysfs attributeVlad Yasevich2014-10-051-0/+17
| | | | | | | | | This patch allows the user to set and retrieve default_pvid value. A new value can only be stored when vlan filtering is disabled. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netfilter: bridge: move br_netfilter out of the corePablo Neira Ayuso2014-09-261-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jesper reported that br_netfilter always registers the hooks since this is part of the bridge core. This harms performance for people that don't need this. This patch modularizes br_netfilter so it can be rmmod'ed, thus, the hooks can be unregistered. I think the bridge netfilter should have been a separated module since the beginning, Patrick agreed on that. Note that this is breaking compatibility for users that expect that bridge netfilter is going to be available after explicitly 'modprobe bridge' or via automatic load through brctl. However, the damage can be easily undone by modprobing br_netfilter. The bridge core also spots a message to provide a clue to people that didn't notice that this has been deprecated. On top of that, the plan is that nftables will not rely on this software layer, but integrate the connection tracking into the bridge layer to enable stateful filtering and NAT, which is was bridge netfilter users seem to require. This patch still keeps the fake_dst_ops in the bridge core, since this is required by when the bridge port is initialized. So we can safely modprobe/rmmod br_netfilter anytime. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Florian Westphal <fw@strlen.de>
* bridge: Support 802.1ad vlan filteringToshiaki Makita2014-06-111-0/+26
| | | | | | | | | | | | | | | | | | | | This enables us to change the vlan protocol for vlan filtering. We come to be able to filter frames on the basis of 802.1ad vlan tags through a bridge. This also changes br->group_addr if it has not been set by user. This is needed for an 802.1ad bridge. (See IEEE 802.1Q-2011 8.13.5.) Furthermore, this sets br->group_fwd_mask_required so that an 802.1ad bridge can forward the Nearest Customer Bridge group addresses except for br->group_addr, which should be passed to higher layer. To change the vlan protocol, write a protocol in sysfs: # echo 0x88a8 > /sys/class/net/br0/bridge/vlan_protocol Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: use DEVICE_ATTR_xx macrossfeldma@cumulusnetworks.com2014-01-061-141/+108
| | | | | | | Use DEVICE_ATTR_RO/RW macros to simplify bridge sysfs attribute definitions. Signed-off-by: Scott Feldman <sfeldma@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: correct the comment for file br_sysfs_br.cWang Sheng-Hui2013-08-071-1/+1
| | | | | | | | br_sysfs_if.c is for sysfs attributes of bridge ports, while br_sysfs_br.c is for sysfs attributes of bridge itself. Correct the comment here. Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: use the bridge IP addr as source addr for querierCong Wang2013-05-221-0/+26
| | | | | | | | | | | | | | | | | | | | | | | | Quote from Adam: "If it is believed that the use of 0.0.0.0 as the IP address is what is causing strange behaviour on other devices then is there a good reason that a bridge rather than a router shouldn't be the active querier? If not then using the bridge IP address and having the querier enabled by default may be a reasonable solution (provided that our querier obeys the election rules and shuts up if it sees a query from a lower IP address that isn't 0.0.0.0). Just because a device is the elected querier for IGMP doesn't appear to mean it is required to perform any other routing functions." And introduce a new troggle for it, as suggested by Herbert. Suggested-by: Adam Baker <linux@baker-net.org.uk> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Adam Baker <linux@baker-net.org.uk> Signed-off-by: Cong Wang <amwang@redhat.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: Add vlan filtering infrastructureVlad Yasevich2013-02-131-0/+21
| | | | | | | | | | | | | | | | | Adds an optional infrustructure component to bridge that would allow native vlan filtering in the bridge. Each bridge port (as well as the bridge device) now get a VLAN bitmap. Each bit in the bitmap is associated with a vlan id. This way if the bit corresponding to the vid is set in the bitmap that the packet with vid is allowed to enter and exit the port. Write access the bitmap is protected by RTNL and read access protected by RCU. Vlan functionality is disabled by default. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>