summaryrefslogtreecommitdiffstats
path: root/net/bridge
Commit message (Collapse)AuthorAgeFilesLines
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller2016-04-121-2/+45
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains the first batch of Netfilter updates for your net-next tree. 1) Define pr_fmt() in nf_conntrack, from Weongyo Jeong. 2) Define and register netfilter's afinfo for the bridge family, this comes in preparation for native nfqueue's bridge for nft, from Stephane Bryant. 3) Add new attributes to store layer 2 and VLAN headers to nfqueue, also from Stephane Bryant. 4) Parse new NFQA_VLAN and NFQA_L2HDR nfqueue netlink attributes coming from userspace, from Stephane Bryant. 5) Use net->ipv6.devconf_all->hop_limit instead of hardcoded hop_limit in IPv6 SYNPROXY, from Liping Zhang. 6) Remove unnecessary check for dst == NULL in nf_reject_ipv6, from Haishuang Yan. 7) Deinline ctnetlink event report functions, from Florian Westphal. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * netfilter: bridge: add nf_afinfo to enable queuing to userspaceStephane Bryant2016-03-291-2/+45
| | | | | | | | | | | | | | | | | | This just adds and registers a nf_afinfo for the ethernet bridge, which enables queuing to userspace for the AF_BRIDGE family. No checksum computation is done. Signed-off-by: Stephane Bryant <stephane.ml.bryant@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* | bridge: Allow set bridge ageing time when switchdev disabledHaishuang Yan2016-03-301-1/+1
| | | | | | | | | | | | | | | | | | | | When NET_SWITCHDEV=n, switchdev_port_attr_set will return -EOPNOTSUPP, we should ignore this error code and continue to set the ageing time. Fixes: c62987bbd8a1 ("bridge: push bridge setting ageing_time down to switchdev") Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Acked-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | netfilter: ipv4: fix NULL dereferenceLiping Zhang2016-03-281-10/+10
| | | | | | | | | | | | | | | | | | | | Commit fa50d974d104 ("ipv4: Namespaceify ip_default_ttl sysctl knob") use sock_net(skb->sk) to get the net namespace, but we can't assume that sk_buff->sk is always exist, so when it is NULL, oops will happen. Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com> Reviewed-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* | netfilter: x_tables: enforce nul-terminated table name from getsockopt ↵Pablo Neira Ayuso2016-03-281-0/+4
|/ | | | | | | | | | | GET_ENTRIES Make sure the table names via getsockopt GET_ENTRIES is nul-terminated in ebtables and all the x_tables variants and their respective compat code. Uncovered by KASAN. Reported-by: Baozeng Ding <sploving1@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* bridge: update max_gso_segs and max_gso_sizeEric Dumazet2016-03-211-0/+16
| | | | | | | | | | | | | It can be useful to lower max_gso_segs on NIC with very low number of TX descriptors like bcmgenet. However, this is defeated by bridge since it does not propagate the lower value of max_gso_segs and max_gso_size. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Petri Gynther <pgynther@google.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: fix potential use-after-free when hook returns QUEUE or STOLEN verdictFlorian Westphal2016-03-141-9/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Zefir Kurtisi reported kernel panic with an openwrt specific patch. However, it turns out that mainline has a similar bug waiting to happen. Once NF_HOOK() returns the skb is in undefined state and must not be used. Moreover, the okfn must consume the skb to support async processing (NF_QUEUE). Current okfn in this spot doesn't consume it and caller assumes that NF_HOOK return value tells us if skb was freed or not, but thats wrong. It "works" because no in-tree user registers a NFPROTO_BRIDGE hook at LOCAL_IN that returns STOLEN or NF_QUEUE verdicts. Once we add NF_QUEUE support for nftables bridge this will break -- NF_QUEUE holds the skb for async processing, caller will erronoulsy return RX_HANDLER_PASS and on reinject netfilter will access free'd skb. Fix this by pushing skb up the stack in the okfn instead. NB: It also seems dubious to use LOCAL_IN while bypassing PRE_ROUTING completely in this case but this is how its been forever so it seems preferable to not change this. Cc: Felix Fietkau <nbd@openwrt.org> Cc: Zefir Kurtisi <zefir.kurtisi@neratec.com> Signed-off-by: Florian Westphal <fw@strlen.de> Tested-by: Zefir Kurtisi <zefir.kurtisi@neratec.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: allow zero ageing timeStephen Hemminger2016-03-111-3/+8
| | | | | | | | | | | | | | | | This fixes a regression in the bridge ageing time caused by: commit c62987bbd8a1 ("bridge: push bridge setting ageing_time down to switchdev") There are users of Linux bridge which use the feature that if ageing time is set to 0 it causes entries to never expire. See: https://www.linuxfoundation.org/collaborate/workgroups/networking/bridge For a pure software bridge, it is unnecessary for the code to have arbitrary restrictions on what values are allowable. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller2016-03-081-3/+65
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pablo Neira Ayuso says: ==================== Netfilter/IPVS updates for net-next The following patchset contains Netfilter updates for your net-next tree, they are: 1) Remove useless debug message when deleting IPVS service, from Yannick Brosseau. 2) Get rid of compilation warning when CONFIG_PROC_FS is unset in several spots of the IPVS code, from Arnd Bergmann. 3) Add prandom_u32 support to nft_meta, from Florian Westphal. 4) Remove unused variable in xt_osf, from Sudip Mukherjee. 5) Don't calculate IP checksum twice from netfilter ipv4 defrag hook since fixing af_packet defragmentation issues, from Joe Stringer. 6) On-demand hook registration for iptables from netns. Instead of registering the hooks for every available netns whenever we need one of the support tables, we register this on the specific netns that needs it, patchset from Florian Westphal. 7) Add missing port range selection to nf_tables masquerading support. BTW, just for the record, there is a typo in the description of 5f6c253ebe93b0 ("netfilter: bridge: register hooks only when bridge interface is added") that refers to the cluster match as deprecated, but it is actually the CLUSTERIP target (which registers hooks inconditionally) the one that is scheduled for removal. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * netfilter: bridge: register hooks only when bridge interface is addedFlorian Westphal2016-03-021-3/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This moves bridge hooks to a register-when-needed scheme. We use a device notifier to register the 'call-iptables' netfilter hooks only once a bridge gets added. This means that if the initial namespace uses a bridge, newly created network namespaces no longer get the PRE_ROUTING ipt_sabotage hook. It will registered in that network namespace once a bridge is created within that namespace. A few modules still use global hooks: - conntrack - bridge PF_BRIDGE hooks - IPVS - CLUSTER match (deprecated) - SYNPROXY As long as these modules are not loaded/used, a new network namespace has empty hook list and NF_HOOK() will boil down to single list_empty test even if initial namespace does stateless packet filtering. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2016-03-081-5/+10
|\ \ | | | | | | | | | | | | | | | | | | | | | Several cases of overlapping changes, as well as one instance (vxlan) of a bug fix in 'net' overlapping with code movement in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: ndo_fdb_dump should report -EMSGSIZE to rtnl_fdb_dump.MINOURA Makoto / 箕浦 真2016-02-261-5/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the send skbuff reaches the end, nlmsg_put and friends returns -EMSGSIZE but it is silently thrown away in ndo_fdb_dump. It is called within a for_each_netdev loop and the first fdb entry of a following netdev could fit in the remaining skbuff. This breaks the mechanism of cb->args[0] and idx to keep track of the entries that are already dumped, which results missing entries in bridge fdb show command. Signed-off-by: Minoura Makoto <minoura@valinux.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | net: remove skb_sender_cpu_clear()WANG Cong2016-03-011-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | After commit 52bd2d62ce67 ("net: better skb->sender_cpu and skb->napi_id cohabitation") skb_sender_cpu_clear() becomes empty and can be removed. Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | bridge: mcast: add support for more router port information dumpingNikolay Aleksandrov2016-03-011-2/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow for more multicast router port information to be dumped such as timer and type attributes. For that that purpose we need to extend the MDBA_ROUTER_PORT attribute similar to how it was done for the mdb entries recently. The new format is thus: [MDBA_ROUTER_PORT] = { <- nested attribute u32 ifindex <- router port ifindex for user-space compatibility [MDBA_ROUTER_PATTR attributes] } This way it remains compatible with older users (they'll simply retrieve the u32 in the beginning) and new users can parse the remaining attributes. It would also allow to add future extensions to the router port without breaking compatibility. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | bridge: mcast: add support for temporary port routerNikolay Aleksandrov2016-03-011-2/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for a temporary router port which doesn't depend only on the incoming query. It can be refreshed if set to the same value, which is a no-op for the rest. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | bridge: mcast: do nothing if port's multicast_router is set to the same valNikolay Aleksandrov2016-03-011-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | This is needed for the upcoming temporary port router. There's no point to go through the logic if the value is the same. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | bridge: mcast: use names for the different multicast_router typesNikolay Aleksandrov2016-03-011-28/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | Using raw values makes it difficult to extend and also understand the code, give them names and do explicit per-option manipulation in br_multicast_set_port_router. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | bridge: notify enslaved devices of headroom changesPaolo Abeni2016-03-011-2/+35
| |/ |/| | | | | | | | | | | | | On bridge needed_headroom changes, the enslaved devices are notified via the ndo_set_rx_headroom method Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: bridge: use __ethtool_get_ksettingsDavid Decotigny2016-02-251-3/+3
| | | | | | | | | | Signed-off-by: David Decotigny <decot@googlers.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2016-02-231-2/+2
|\| | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/phy/bcm7xxx.c drivers/net/phy/marvell.c drivers/net/vxlan.c All three conflicts were cases of simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
| * bridge: mdb: avoid uninitialized variable warningArnd Bergmann2016-02-161-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A recent change to the mdb code confused the compiler to the point where it did not realize that the port-group returned from br_mdb_add_group() is always valid when the function returns a nonzero return value, so we get a spurious warning: net/bridge/br_mdb.c: In function 'br_mdb_add': net/bridge/br_mdb.c:542:4: error: 'pg' may be used uninitialized in this function [-Werror=maybe-uninitialized] __br_mdb_notify(dev, entry, RTM_NEWMDB, pg); Slightly rearranging the code in br_mdb_add_group() makes the problem go away, as gcc is clever enough to see that both functions check for 'ret != 0'. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: 9e8430f8d60d ("bridge: mdb: Passing the port-group pointer to br_mdb module") Signed-off-by: David S. Miller <davem@davemloft.net>
* | bridge: mdb: add support for more attributes and export timerNikolay Aleksandrov2016-02-191-1/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently mdb entries are exported directly as a structure inside MDBA_MDB_ENTRY_INFO attribute, we can't really extend it without breaking user-space. In order to export new mdb fields, I've converted the MDBA_MDB_ENTRY_INFO into a nested attribute which starts like before with struct br_mdb_entry (without header, as it's casted directly in iproute2) and continues with MDBA_MDB_EATTR_ attributes. This way we keep compatibility with older users and can export new data. I've tested this with iproute2, both with and without support for the added attribute and it works fine. So basically we again have MDBA_MDB_ENTRY_INFO with struct br_mdb_entry inside but it may contain also some additional MDBA_MDB_EATTR_ attributes such as MDBA_MDB_EATTR_TIMER which can be parsed by user-space. So the new structure is: [MDBA_MDB] = { [MDBA_MDB_ENTRY] = { [MDBA_MDB_ENTRY_INFO] [MDBA_MDB_ENTRY_INFO] { <- Nested attribute struct br_mdb_entry <- nla_put_nohdr() [MDBA_MDB_ENTRY attributes] <- normal netlink attributes } } } Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bridge: mdb: reduce the indentation level in br_mdb_fill_infoNikolay Aleksandrov2016-02-191-16/+17
| | | | | | | | | | | | | | | | Switch the port check and skip if it's null, this allows us to reduce one indentation level. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: bridge: log port STP state on changeVivien Didelot2016-02-185-15/+4
| | | | | | | | | | | | | | | | | | Remove the shared br_log_state function and print the info directly in br_set_state, where the net_bridge_port state is actually changed. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Acked-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bridge: switchdev: Offload VLAN flags to hardware bridgeIdo Schimmel2016-02-181-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When VLANs are created / destroyed on a VLAN filtering bridge (MASTER flag set), the configuration is passed down to the hardware. However, when only the flags (e.g. PVID) are toggled, the configuration is done in the software bridge alone. While it is possible to pass these flags to hardware when invoked with the SELF flag set, this creates inconsistency with regards to the way the VLANs are initially configured. Pass the flags down to the hardware even when the VLAN already exists and only the flags are toggled. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | ipv4: Namespaceify ip_default_ttl sysctl knobNikolay Borisov2016-02-161-3/+5
| | | | | | | | | | Signed-off-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bridge: mdb: Passing the port-group pointer to br_mdb moduleElad Raz2016-02-093-29/+34
| | | | | | | | | | | | | | | | | | | | Passing the port-group to br_mdb in order to allow direct access to the structure. br_mdb will later use the structure to reflect HW reflection status via "state" variable. Signed-off-by: Elad Raz <eladr@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bridge: mdb: Separate br_mdb_entry->state from net_bridge_port_group->stateElad Raz2016-02-093-15/+26
|/ | | | | | | | | Change net_bridge_port_group 'state' member to 'flags' and define new set of flags internal to the kernel. Signed-off-by: Elad Raz <eladr@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* switchdev: Require RTNL mutex to be held when sending FDB notificationsIdo Schimmel2016-01-281-2/+1
| | | | | | | | | | | | | | | | | | | | | | When switchdev drivers process FDB notifications from the underlying device they resolve the netdev to which the entry points to and notify the bridge using the switchdev notifier. However, since the RTNL mutex is not held there is nothing preventing the netdev from disappearing in the middle, which will cause br_switchdev_event() to dereference a non-existing netdev. Make switchdev drivers hold the lock at the beginning of the notification processing session and release it once it ends, after notifying the bridge. Also, remove switchdev_mutex and fdb_lock, as they are no longer needed when RTNL mutex is held. Fixes: 03bf0c281234 ("switchdev: introduce switchdev notifier") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: fix lockdep addr_list_lock false positive splatNikolay Aleksandrov2016-01-151-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After promisc mode management was introduced a bridge device could do dev_set_promiscuity from its ndo_change_rx_flags() callback which in turn can be called after the bridge's addr_list_lock has been taken (e.g. by dev_uc_add). This causes a false positive lockdep splat because the port interfaces' addr_list_lock is taken when br_manage_promisc() runs after the bridge's addr list lock was already taken. To remove the false positive introduce a custom bridge addr_list_lock class and set it on bridge init. A simple way to reproduce this is with the following: $ brctl addbr br0 $ ip l add l br0 br0.100 type vlan id 100 $ ip l set br0 up $ ip l set br0.100 up $ echo 1 > /sys/class/net/br0/bridge/vlan_filtering $ brctl addif br0 eth0 Splat: [ 43.684325] ============================================= [ 43.684485] [ INFO: possible recursive locking detected ] [ 43.684636] 4.4.0-rc8+ #54 Not tainted [ 43.684755] --------------------------------------------- [ 43.684906] brctl/1187 is trying to acquire lock: [ 43.685047] (_xmit_ETHER){+.....}, at: [<ffffffff8150169e>] dev_set_rx_mode+0x1e/0x40 [ 43.685460] but task is already holding lock: [ 43.685618] (_xmit_ETHER){+.....}, at: [<ffffffff815072a7>] dev_uc_add+0x27/0x80 [ 43.686015] other info that might help us debug this: [ 43.686316] Possible unsafe locking scenario: [ 43.686743] CPU0 [ 43.686967] ---- [ 43.687197] lock(_xmit_ETHER); [ 43.687544] lock(_xmit_ETHER); [ 43.687886] *** DEADLOCK *** [ 43.688438] May be due to missing lock nesting notation [ 43.688882] 2 locks held by brctl/1187: [ 43.689134] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff81510317>] rtnl_lock+0x17/0x20 [ 43.689852] #1: (_xmit_ETHER){+.....}, at: [<ffffffff815072a7>] dev_uc_add+0x27/0x80 [ 43.690575] stack backtrace: [ 43.690970] CPU: 0 PID: 1187 Comm: brctl Not tainted 4.4.0-rc8+ #54 [ 43.691270] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014 [ 43.691770] ffffffff826a25c0 ffff8800369fb8e0 ffffffff81360ceb ffffffff826a25c0 [ 43.692425] ffff8800369fb9b8 ffffffff810d0466 ffff8800369fb968 ffffffff81537139 [ 43.693071] ffff88003a08c880 0000000000000000 00000000ffffffff 0000000002080020 [ 43.693709] Call Trace: [ 43.693931] [<ffffffff81360ceb>] dump_stack+0x4b/0x70 [ 43.694199] [<ffffffff810d0466>] __lock_acquire+0x1e46/0x1e90 [ 43.694483] [<ffffffff81537139>] ? netlink_broadcast_filtered+0x139/0x3e0 [ 43.694789] [<ffffffff8153b5da>] ? nlmsg_notify+0x5a/0xc0 [ 43.695064] [<ffffffff810d10f5>] lock_acquire+0xe5/0x1f0 [ 43.695340] [<ffffffff8150169e>] ? dev_set_rx_mode+0x1e/0x40 [ 43.695623] [<ffffffff815edea5>] _raw_spin_lock_bh+0x45/0x80 [ 43.695901] [<ffffffff8150169e>] ? dev_set_rx_mode+0x1e/0x40 [ 43.696180] [<ffffffff8150169e>] dev_set_rx_mode+0x1e/0x40 [ 43.696460] [<ffffffff8150189c>] dev_set_promiscuity+0x3c/0x50 [ 43.696750] [<ffffffffa0586845>] br_port_set_promisc+0x25/0x50 [bridge] [ 43.697052] [<ffffffffa05869aa>] br_manage_promisc+0x8a/0xe0 [bridge] [ 43.697348] [<ffffffffa05826ee>] br_dev_change_rx_flags+0x1e/0x20 [bridge] [ 43.697655] [<ffffffff81501532>] __dev_set_promiscuity+0x132/0x1f0 [ 43.697943] [<ffffffff81501672>] __dev_set_rx_mode+0x82/0x90 [ 43.698223] [<ffffffff815072de>] dev_uc_add+0x5e/0x80 [ 43.698498] [<ffffffffa05b3c62>] vlan_device_event+0x542/0x650 [8021q] [ 43.698798] [<ffffffff8109886d>] notifier_call_chain+0x5d/0x80 [ 43.699083] [<ffffffff810988b6>] raw_notifier_call_chain+0x16/0x20 [ 43.699374] [<ffffffff814f456e>] call_netdevice_notifiers_info+0x6e/0x80 [ 43.699678] [<ffffffff814f4596>] call_netdevice_notifiers+0x16/0x20 [ 43.699973] [<ffffffffa05872be>] br_add_if+0x47e/0x4c0 [bridge] [ 43.700259] [<ffffffffa058801e>] add_del_if+0x6e/0x80 [bridge] [ 43.700548] [<ffffffffa0588b5f>] br_dev_ioctl+0xaf/0xc0 [bridge] [ 43.700836] [<ffffffff8151a7ac>] dev_ifsioc+0x30c/0x3c0 [ 43.701106] [<ffffffff8151aac9>] dev_ioctl+0xf9/0x6f0 [ 43.701379] [<ffffffff81254345>] ? mntput_no_expire+0x5/0x450 [ 43.701665] [<ffffffff812543ee>] ? mntput_no_expire+0xae/0x450 [ 43.701947] [<ffffffff814d7b02>] sock_do_ioctl+0x42/0x50 [ 43.702219] [<ffffffff814d8175>] sock_ioctl+0x1e5/0x290 [ 43.702500] [<ffffffff81242d0b>] do_vfs_ioctl+0x2cb/0x5c0 [ 43.702771] [<ffffffff81243079>] SyS_ioctl+0x79/0x90 [ 43.703033] [<ffffffff815eebb6>] entry_SYSCALL_64_fastpath+0x16/0x7a CC: Vlad Yasevich <vyasevic@redhat.com> CC: Stephen Hemminger <stephen@networkplumber.org> CC: Bridge list <bridge@lists.linux-foundation.org> CC: Andy Gospodarek <gospo@cumulusnetworks.com> CC: Roopa Prabhu <roopa@cumulusnetworks.com> Fixes: 2796d0c648c9 ("bridge: Automatically manage port promiscuous mode.") Reported-by: Andy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: Reflect MDB entries to hardwareElad Raz2016-01-101-0/+23
| | | | | | | | | | Offload MDB changes per port to hardware Signed-off-by: Elad Raz <eladr@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller2016-01-081-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for net-next, they are: 1) Release nf_tables objects on netns destructions via nft_release_afinfo(). 2) Destroy basechain and rules on netdevice removal in the new netdev family. 3) Get rid of defensive check against removal of inactive objects in nf_tables. 4) Pass down netns pointer to our existing nfnetlink callbacks, as well as commit() and abort() nfnetlink callbacks. 5) Allow to invert limit expression in nf_tables, so we can throttle overlimit traffic. 6) Add packet duplication for the netdev family. 7) Add forward expression for the netdev family. 8) Define pr_fmt() in conntrack helpers. 9) Don't leave nfqueue configuration on inconsistent state in case of errors, from Ken-ichirou MATSUZAWA, follow up patches are also from him. 10) Skip queue option handling after unbind. 11) Return error on unknown both in nfqueue and nflog command. 12) Autoload ctnetlink when NFQA_CFG_F_CONNTRACK is set. 13) Add new NFTA_SET_USERDATA attribute to store user data in sets, from Carlos Falgueras. 14) Add support for 64 bit byteordering changes nf_tables, from Florian Westphal. 15) Add conntrack byte/packet counter matching support to nf_tables, also from Florian. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * netfilter: nf_tables: release objects on netns destructionPablo Neira Ayuso2015-12-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | We have to release the existing objects on netns removal otherwise we leak them. Chains are unregistered in first place to make sure no packets are walking on our rules and sets anymore. The object release happens by when we unregister the family via nft_release_afinfo() which is called from nft_unregister_afinfo() from the corresponding __net_exit path in every family. Reported-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2016-01-061-1/+4
|\ \
| * | bridge: Only call /sbin/bridge-stp for the initial network namespaceHannes Frederic Sowa2016-01-051-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [I stole this patch from Eric Biederman. He wrote:] > There is no defined mechanism to pass network namespace information > into /sbin/bridge-stp therefore don't even try to invoke it except > for bridge devices in the initial network namespace. > > It is possible for unprivileged users to cause /sbin/bridge-stp to be > invoked for any network device name which if /sbin/bridge-stp does not > guard against unreasonable arguments or being invoked twice on the > same network device could cause problems. [Hannes: changed patch using netns_eq] Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | bridge: add vlan filtering change for new bridged deviceElad Raz2016-01-061-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | Notifying hardware about newly bridged port vlan-aware changes. Signed-off-by: Elad Raz <eladr@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | bridge: add vlan filtering change notificationElad Raz2016-01-061-2/+16
| | | | | | | | | | | | | | | | | | | | | | | | Notifying hardware about bridge vlan-aware changes. Signed-off-by: Elad Raz <eladr@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | bridge: Propagate vlan add failure to userElad Raz2016-01-061-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | Disallow adding interfaces to a bridge when vlan filtering operation failed. Send the failure code to the user. Signed-off-by: Elad Raz <eladr@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2015-12-311-1/+1
|\| | | |/ |/|
| * switchdev: bridge: Pass ageing time as clock_t instead of jiffiesIdo Schimmel2015-12-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The bridge's ageing time is offloaded to hardware when: 1) A port joins a bridge 2) The ageing time of the bridge is changed In the first case the ageing time is offloaded as jiffies, but in the second case it's offloaded as clock_t, which is what existing switchdev drivers expect to receive. Fixes: 6ac311ae8bfb ("Adding switchdev ageing notification on port bridged") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bridge: use kobj_to_dev instead of to_devGeliang Tang2015-12-231-2/+1
| | | | | | | | | | | | | | | | kobj_to_dev has been defined in linux/device.h, so I replace to_dev with it. Signed-off-by: Geliang Tang <geliangtang@163.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller2015-12-188-83/+91
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains the first batch of Netfilter updates for the upcoming 4.5 kernel. This batch contains userspace netfilter header compilation fixes, support for packet mangling in nf_tables, the new tracing infrastructure for nf_tables and cgroup2 support for iptables. More specifically, they are: 1) Two patches to include dependencies in our netfilter userspace headers to resolve compilation problems, from Mikko Rapeli. 2) Four comestic cleanup patches for the ebtables codebase, from Ian Morris. 3) Remove duplicate include in the netfilter reject infrastructure, from Stephen Hemminger. 4) Two patches to simplify the netfilter defragmentation code for IPv6, patch from Florian Westphal. 5) Fix root ownership of /proc/net netfilter for unpriviledged net namespaces, from Philip Whineray. 6) Get rid of unused fields in struct nft_pktinfo, from Florian Westphal. 7) Add mangling support to our nf_tables payload expression, from Patrick McHardy. 8) Introduce a new netlink-based tracing infrastructure for nf_tables, from Florian Westphal. 9) Change setter functions in nfnetlink_log to be void, from Rami Rosen. 10) Add netns support to the cttimeout infrastructure. 11) Add cgroup2 support to iptables, from Tejun Heo. 12) Introduce nfnl_dereference_protected() in nfnetlink, from Florian. 13) Add support for mangling pkttype in the nf_tables meta expression, also from Florian. BTW, I need that you pull net into net-next, I have another batch that requires changes that I don't yet see in net. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * \ Merge branch 'master' of ↵Pablo Neira Ayuso2015-12-141-1/+1
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Resolve conflict between commit 264640fc2c5f4f ("ipv6: distinguish frag queues by device for multicast and link-local packets") from the net tree and commit 029f7f3b8701c ("netfilter: ipv6: nf_defrag: avoid/free clone operations") from the nf-next tree. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Conflicts: net/ipv6/netfilter/nf_conntrack_reasm.c
| * | | netfilter: nf_tables: wrap tracing with a static keyFlorian Westphal2015-12-091-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Only needed when meta nftrace rule(s) were added. The assumption is that no such rules are active, so the call to nft_trace_init is "never" needed. When nftrace rules are active, we always call the nft_trace_* functions, but will only send netlink messages when all of the following are true: - traceinfo structure was initialised - skb->nf_trace == 1 - at least one subscriber to trace group. Adding an extra conditional (static_branch ... && skb->nf_trace) nft_trace_init( ..) Is possible but results in a larger nft_do_chain footprint. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | netfilter-bridge: layout of if statementsIan Morris2015-11-232-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Eliminate some checkpatch issues by improved layout of if statements. No changes detected by objdiff. Signed-off-by: Ian Morris <ipm@chirality.org.uk> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | netfilter-bridge: brace placementIan Morris2015-11-232-6/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change brace placement to eliminate checkpatch error. No changes detected by objdiff. Signed-off-by: Ian Morris <ipm@chirality.org.uk> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | netfilter-bridge: use netdev style commentsIan Morris2015-11-233-46/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Changes comments to use netdev style. No changes detected by objdiff. Signed-off-by: Ian Morris <ipm@chirality.org.uk> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | netfilter-bridge: Cleanse indentationIan Morris2015-11-234-25/+25
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | Fixes a bunch of issues detected by checkpatch with regards to code indentation. No changes detected by objdiff. Signed-off-by: Ian Morris <ipm@chirality.org.uk> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* | | switchdev: Pass original device to port netdev driverIdo Schimmel2015-12-154-0/+6
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | switchdev drivers need to know the netdev on which the switchdev op was invoked. For example, the STP state of a VLAN interface configured on top of a port can change while being member in a bridge. In this case, the underlying driver should only change the STP state of that particular VLAN and not of all the VLANs configured on the port. However, current switchdev infrastructure only passes the port netdev down to the driver. Solve that by passing the original device down to the driver as part of the required switchdev object / attribute. This doesn't entail any change in current switchdev drivers. It simply enables those supporting stacked devices to know the originating device and act accordingly. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: add possibility to pass information about upper device via notifierJiri Pirko2015-12-031-1/+1
| | | | | | | | | | | | | | | | | | Sometimes the drivers and other code would find it handy to know some internal information about upper device being changed. So allow upper-code to pass information down to notifier listeners during linking. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>