summaryrefslogtreecommitdiffstats
path: root/net
Commit message (Collapse)AuthorAgeFilesLines
...
| | * | | | | | batman-adv: always run purge_orig_neighborsSimon Wunderlich2014-05-101-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current code will not execute batadv_purge_orig_neighbors() when an orig_ifinfo has already been purged. However we need to run it in any case. Fix that. This is a regression introduced by 7351a4822d42827ba0110677c0cbad88a3d52585 ("batman-adv: split out router from orig_node") Signed-off-by: Simon Wunderlich <simon@open-mesh.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
| | * | | | | | batman-adv: fix neigh reference imbalanceSimon Wunderlich2014-05-101-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When an interface is removed from batman-adv, the orig_ifinfo of a orig_node may be removed without releasing the router first. This will prevent the reference for the neighbor pointed at by the orig_ifinfo->router to be released, and this leak may result in reference leaks for the interface used by this neighbor. Fix that. This is a regression introduced by 7351a4822d42827ba0110677c0cbad88a3d52585 ("batman-adv: split out router from orig_node"). Reported-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Simon Wunderlich <simon@open-mesh.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
| | * | | | | | batman-adv: fix neigh_ifinfo imbalanceSimon Wunderlich2014-05-101-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The neigh_ifinfo object must be freed if it has been used in batadv_iv_ogm_process_per_outif(). This is a regression introduced by 89652331c00f43574515059ecbf262d26d885717 ("batman-adv: split tq information in neigh_node struct") Reported-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Simon Wunderlich <simon@open-mesh.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
| * | | | | | | neigh: set nud_state to NUD_INCOMPLETE when probing router reachabilityDuan Jiong2014-05-131-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit 7e98056964("ipv6: router reachability probing"), a router falls into NUD_FAILED will be probed. Now if function rt6_select() selects a router which neighbour state is NUD_FAILED, and at the same time function rt6_probe() changes the neighbour state to NUD_PROBE, then function dst_neigh_output() can directly send packets, but actually the neighbour still is unreachable. If we set nud_state to NUD_INCOMPLETE instead NUD_PROBE, packets will not be sent out until the neihbour is reachable. In addition, because the route should be probes with a single NS, so we must set neigh->probes to neigh_max_probes(), then the neigh timer timeout and function neigh_timer_handler() will not send other NS Messages. Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | ip6_tunnel: fix potential NULL pointer dereferenceSusant Sahani2014-05-131-1/+1
| |/ / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function ip6_tnl_validate assumes that the rtnl attribute IFLA_IPTUN_PROTO always be filled . If this attribute is not filled by the userspace application kernel get crashed with NULL pointer dereference. This patch fixes the potential kernel crash when IFLA_IPTUN_PROTO is missing . Signed-off-by: Susant Sahani <susant@redhat.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | Merge branch 'for-davem' of ↵David S. Miller2014-05-096-9/+18
| |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== pull request: wireless 2014-05-08 This one is all from Johannes: "Here are a few small fixes for the current cycle: radiotap TX flags were wrong (fix by Bob), Chun-Yeow fixes an SMPS issue with mesh interfaces, Eliad fixes a locking bug and a cfg80211 state problem and finally Henning sent me a fix for IBSS rate information." Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * \ \ \ \ \ Merge branch 'master' of ↵John W. Linville2014-05-086-9/+18
| | |\ \ \ \ \ \ | | | | |/ / / / | | | |/| | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
| | | * | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211John W. Linville2014-05-066-9/+18
| | | |\ \ \ \ \ | | | | | |/ / / | | | | |/| | |
| | | | * | | | mac80211: fix nested rtnl locking on ieee80211_reconfigEliad Peller2014-05-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ieee80211_reconfig already holds rtnl, so calling cfg80211_sched_scan_stopped results in deadlock. Use the rtnl-version of this function instead. Fixes: d43c6b6 ("mac80211: reschedule sched scan after HW restart") Cc: stable@vger.kernel.org (3.14+) Signed-off-by: Eliad Peller <eliadx.peller@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| | | | * | | | cfg80211: add cfg80211_sched_scan_stopped_rtnlEliad Peller2014-05-051-2/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add locked-version for cfg80211_sched_scan_stopped. This is used for some users that might want to call it when rtnl is already locked. Fixes: d43c6b6 ("mac80211: reschedule sched scan after HW restart") Cc: stable@vger.kernel.org (3.14+) Signed-off-by: Eliad Peller <eliadx.peller@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| | | | * | | | cfg80211: free sme on connection failuresEliad Peller2014-05-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cfg80211 is notified about connection failures by __cfg80211_connect_result() call. However, this function currently does not free cfg80211 sme. This results in hanging connection attempts in some cases e.g. when mac80211 authentication attempt is denied, we have this function call: ieee80211_rx_mgmt_auth() -> cfg80211_rx_mlme_mgmt() -> cfg80211_process_auth() -> cfg80211_sme_rx_auth() -> __cfg80211_connect_result() but cfg80211_sme_free() is never get called. Fixes: ceca7b712 ("cfg80211: separate internal SME implementation") Cc: stable@vger.kernel.org (3.10+) Signed-off-by: Eliad Peller <eliadx.peller@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| | | | * | | | mac80211: Fix mac80211 station info rx bitrate for IBSS modeHenning Rogge2014-05-051-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Filter out incoming multicast packages before applying their bitrate to the rx bitrate station info field to prevent them from setting the rx bitrate to the basic multicast rate. Signed-off-by: Henning Rogge <hrogge@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| | | | * | | | mac80211: fixup radiotap tx flags for RTS/CTSBob Copeland2014-04-221-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When using RTS/CTS, the CTS-to-Self bit in radiotap TX flags is getting set instead of the RTS bit. Set the correct one. Reported-by: Larry Maxwell <larrymaxwell@agilemesh.com> Signed-off-by: Bob Copeland <bob@cozybit.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| | | | * | | | mac80211: avoid handling of SMPS for meshChun-Yeow Yeoh2014-04-221-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The patch "mac80211: implement SMPS for AP" has caused kernel oops at mesh STA if the peer mesh STA operates in sleep mode and then becomes active mode. It can be easily reproduced by setting the following commands at peer mesh STA: iw mesh0 station set aa:bb:cc:dd:ee:ff mesh_power_mode deep iw mesh0 station set aa:bb:cc:dd:ee:ff mesh_power_mode active Kernel oops will happen at mesh STA aa:bb:cc:dd:ee:ff. Fix this by avoiding SMPS for mesh mode. Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| * | | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller2014-05-096-12/+19
| |\ \ \ \ \ \ \ | | | |_|_|_|/ / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following batch contains netfilter fixes for your net tree, they are: 1) Fix use after free in nfnetlink when sending a batch for some unsupported subsystem, from Denys Fedoryshchenko. 2) Skip autoload of the nat module if no binding is specified via ctnetlink, from Florian Westphal. 3) Set local_df after netfilter defragmentation to avoid a bogus ICMP fragmentation needed in the forwarding path, also from Florian. 4) Fix potential user after free in ip6_route_me_harder() when returning the error code to the upper layers, from Sergey Popovich. 5) Skip possible bogus ICMP time exceeded emitted from the router (not valid according to RFC) if conntrack zones are used, from Vasily Averin. 6) Fix fragment handling when nf_defrag_ipv4 is loaded but nf_conntrack is not present, also from Vasily. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | netfilter: Fix potential use after free in ip6_route_me_harder()Sergey Popovich2014-05-091-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Dst is released one line before we access it again with dst->error. Fixes: 58e35d147128 netfilter: ipv6: propagate routing errors from ip6_route_me_harder() Signed-off-by: Sergey Popovich <popovich_sergei@mail.ru> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | | | bridge: superfluous skb->nfct check in br_nf_dev_queue_xmitVasily Averin2014-05-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently bridge can silently drop ipv4 fragments. If node have loaded nf_defrag_ipv4 module but have no nf_conntrack_ipv4, br_nf_pre_routing defragments incoming ipv4 fragments but nfct check in br_nf_dev_queue_xmit does not allow re-fragment combined packet back, and therefore it is dropped in br_dev_queue_push_xmit without incrementing of any failcounters It seems the only way to hit the ip_fragment code in the bridge xmit path is to have a fragment list whose reassembled fragments go over the mtu. This only happens if nf_defrag is enabled. Thanks to Florian Westphal for providing feedback to clarify this. Defragmentation ipv4 is required not only in conntracks but at least in TPROXY target and socket match, therefore #ifdef is changed from NF_CONNTRACK_IPV4 to NF_DEFRAG_IPV4 Signed-off-by: Vasily Averin <vvs@openvz.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | | | ipv4: fix "conntrack zones" support for defrag user check in ip_expireVasily Averin2014-05-051-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Defrag user check in ip_expire was not updated after adding support for "conntrack zones". This bug manifests as a RFC violation, since the router will send the icmp time exceeeded message when using conntrack zones. Signed-off-by: Vasily Averin <vvs@openvz.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | | | netfilter: nfnetlink: Fix use after free when it fails to process batchDenys Fedoryshchenko2014-05-041-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This bug manifests when calling the nft command line tool without nf_tables kernel support. kernel message: [ 44.071555] Netfilter messages via NETLINK v0.30. [ 44.072253] BUG: unable to handle kernel NULL pointer dereference at 0000000000000119 [ 44.072264] IP: [<ffffffff8171db1f>] netlink_getsockbyportid+0xf/0x70 [ 44.072272] PGD 7f2b74067 PUD 7f2b73067 PMD 0 [ 44.072277] Oops: 0000 [#1] SMP [...] [ 44.072369] Call Trace: [ 44.072373] [<ffffffff8171fd81>] netlink_unicast+0x91/0x200 [ 44.072377] [<ffffffff817206c9>] netlink_ack+0x99/0x110 [ 44.072381] [<ffffffffa004b951>] nfnetlink_rcv+0x3c1/0x408 [nfnetlink] [ 44.072385] [<ffffffff8171fde3>] netlink_unicast+0xf3/0x200 [ 44.072389] [<ffffffff817201ef>] netlink_sendmsg+0x2ff/0x740 [ 44.072394] [<ffffffff81044752>] ? __mmdrop+0x62/0x90 [ 44.072398] [<ffffffff816dafdb>] sock_sendmsg+0x8b/0xc0 [ 44.072403] [<ffffffff812f1af5>] ? copy_user_enhanced_fast_string+0x5/0x10 [ 44.072406] [<ffffffff816dbb6c>] ? move_addr_to_kernel+0x2c/0x50 [ 44.072410] [<ffffffff816db423>] ___sys_sendmsg+0x3c3/0x3d0 [ 44.072415] [<ffffffff811301ba>] ? handle_mm_fault+0xa9a/0xc60 [ 44.072420] [<ffffffff811362d6>] ? mmap_region+0x166/0x5a0 [ 44.072424] [<ffffffff817da84c>] ? __do_page_fault+0x1dc/0x510 [ 44.072428] [<ffffffff812b8b2c>] ? apparmor_capable+0x1c/0x60 [ 44.072435] [<ffffffff817d6e9a>] ? _raw_spin_unlock_bh+0x1a/0x20 [ 44.072439] [<ffffffff816dfc86>] ? release_sock+0x106/0x150 [ 44.072443] [<ffffffff816dc212>] __sys_sendmsg+0x42/0x80 [ 44.072446] [<ffffffff816dc262>] SyS_sendmsg+0x12/0x20 [ 44.072450] [<ffffffff817df616>] system_call_fastpath+0x1a/0x1f Signed-off-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | | | netfilter: ipv4: defrag: set local_df flag on defragmented skbFlorian Westphal2014-05-041-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | else we may fail to forward skb even if original fragments do fit outgoing link mtu: 1. remote sends 2k packets in two 1000 byte frags, DF set 2. we want to forward but only see '2k > mtu and DF set' 3. we then send icmp error saying that outgoing link is 1500 But original sender never sent a packet that would not fit the outgoing link. Setting local_df makes outgoing path test size vs. IPCB(skb)->frag_max_size, so we will still send the correct error in case the largest original size did not fit outgoing link mtu. Reported-by: Maxime Bizon <mbizon@freebox.fr> Suggested-by: Maxime Bizon <mbizon@freebox.fr> Fixes: 5f2d04f1f9 (ipv4: fix path MTU discovery with connection tracking) Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | | | netfilter: ctnetlink: don't add null bindings if no nat requestedFlorian Westphal2014-04-291-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 0eba801b64cc8284d9024c7ece30415a2b981a72 tried to fix a race where nat initialisation can happen after ctnetlink-created conntrack has been created. However, it causes the nat module(s) to be loaded needlessly on systems that are not using NAT. Fortunately, we do not have to create null bindings in that case. conntracks injected via ctnetlink always have the CONFIRMED bit set, which prevents addition of the nat extension in nf_nat_ipv4/6_fn(). We only need to make sure that either no nat extension is added or that we've created both src and dst manips. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | | | | | ping: move ping_group_range out of CONFIG_SYSCTLCong Wang2014-05-083-13/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Similarly, when CONFIG_SYSCTL is not set, ping_group_range should still work, just that no one can change it. Therefore we should move it out of sysctl_net_ipv4.c. And, it should not share the same seqlock with ip_local_port_range. BTW, rename it to ->ping_group_range instead. Cc: David S. Miller <davem@davemloft.net> Cc: Francois Romieu <romieu@fr.zoreil.com> Reported-by: Stefan de Konink <stefan@konink.de> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | ipv4: move local_port_range out of CONFIG_SYSCTLCong Wang2014-05-084-24/+45
| | |/ / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When CONFIG_SYSCTL is not set, ip_local_port_range should still work, just that no one can change it. Therefore we should move it out of sysctl_inet.c. Also, rename it to ->ip_local_ports instead. Cc: David S. Miller <davem@davemloft.net> Cc: Francois Romieu <romieu@fr.zoreil.com> Reported-by: Stefan de Konink <stefan@konink.de> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | ipv4: fib_semantics: increment fib_info_cnt after fib_info allocationSergey Popovich2014-05-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Increment fib_info_cnt in fib_create_info() right after successfuly alllocating fib_info structure, overwise fib_metrics allocation failure leads to fib_info_cnt incorrectly decremented in free_fib_info(), called on error path from fib_create_info(). Signed-off-by: Sergey Popovich <popovich_sergei@mail.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | Revert "net: core: introduce netif_skb_dev_features"Florian Westphal2014-05-071-12/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit d206940319c41df4299db75ed56142177bb2e5f6, there are no more callers. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | net: ip: push gso skb forwarding handling down the stackFlorian Westphal2014-05-072-53/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Doing the segmentation in the forward path has one major drawback: When using virtio, we may process gso udp packets coming from host network stack. In that case, netfilter POSTROUTING will see one packet with udp header followed by multiple ip fragments. Delay the segmentation and do it after POSTROUTING invocation to avoid this. Fixes: fe6cc55f3a9 ("net: ip, ipv6: handle gso skbs in forwarding path") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | net: ipv6: send pkttoobig immediately if orig frag size > mtuFlorian Westphal2014-05-071-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If conntrack defragments incoming ipv6 frags it stores largest original frag size in ip6cb and sets ->local_df. We must thus first test the largest original frag size vs. mtu, and not vice versa. Without this patch PKTTOOBIG is still generated in ip6_fragment() later in the stack, but 1) IPSTATS_MIB_INTOOBIGERRORS won't increment 2) packet did (needlessly) traverse netfilter postrouting hook. Fixes: fe6cc55f3a9 ("net: ip, ipv6: handle gso skbs in forwarding path") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | net: ipv4: ip_forward: fix inverted local_df testFlorian Westphal2014-05-071-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | local_df means 'ignore DF bit if set', so if its set we're allowed to perform ip fragmentation. This wasn't noticed earlier because the output path also drops such skbs (and emits needed icmp error) and because netfilter ip defrag did not set local_df until couple of days ago. Only difference is that DF-packets-larger-than MTU now discarded earlier (f.e. we avoid pointless netfilter postrouting trip). While at it, drop the repeated test ip_exceeds_mtu, checking it once is enough... Fixes: fe6cc55f3a9 ("net: ip, ipv6: handle gso skbs in forwarding path") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | crush: decode and initialize chooseleaf_vary_rIlya Dryomov2014-05-161-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit e2b149cc4ba0 ("crush: add chooseleaf_vary_r tunable") added the crush_map::chooseleaf_vary_r field but missed the decode part. This lead to misdirected requests caused by incorrect raw crush mapping sets. Fixes: http://tracker.ceph.com/issues/8226 Reported-and-Tested-by: Dmitry Smirnov <onlyjob@member.fsf.org> Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
* | | | | | | libceph: fix corruption when using page_count 0 page in rbdChunwei Chen2014-05-161-1/+19
|/ / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It has been reported that using ZFSonLinux on rbd will result in memory corruption. The bug report can be found here: https://github.com/zfsonlinux/spl/issues/241 http://tracker.ceph.com/issues/7790 The reason is that ZFS will send pages with page_count 0 into rbd, which in turns send them to tcp_sendpage. However, tcp_sendpage cannot deal with page_count 0, as it will do get_page and put_page, and erroneously free the page. This type of issue has been noted before, and handled in iscsi, drbd, etc. So, rbd should also handle this. This fix address this issue by fall back to slower sendmsg when page_count 0 detected. Cc: Sage Weil <sage@inktank.com> Cc: Yehuda Sadeh <yehuda@inktank.com> Cc: stable@vger.kernel.org Signed-off-by: Chunwei Chen <tuxoko@gmail.com> Reviewed-by: Ilya Dryomov <ilya.dryomov@inktank.com>
* | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2014-05-0533-105/+270
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking fixes from David Miller: 1) e1000e computes header length incorrectly wrt vlans, fix from Vlad Yasevich. 2) ns_capable() check in sock_diag netlink code, from Andrew Lutomirski. 3) Fix invalid queue pairs handling in virtio_net, from Amos Kong. 4) Checksum offloading busted in sxgbe driver due to incorrect descriptor layout, fix from Byungho An. 5) Fix build failure with SMC_DEBUG set to 2 or larger, from Zi Shen Lim. 6) Fix uninitialized A and X registers in BPF interpreter, from Alexei Starovoitov. 7) Fix arch dependencies of candence driver. 8) Fix netlink capabilities checking tree-wide, from Eric W Biederman. 9) Don't dump IFLA_VF_PORTS if netlink request didn't ask for it in IFLA_EXT_MASK, from David Gibson. 10) IPV6 FIB dump restart doesn't handle table changes that happen meanwhile, causing the code to loop forever or emit dups, fix from Kumar Sandararajan. 11) Memory leak on VF removal in bnx2x, from Yuval Mintz. 12) Bug fixes for new Altera TSE driver from Vince Bridgers. 13) Fix route lookup key in SCTP, from Xugeng Zhang. 14) Use BH blocking spinlocks in SLIP, as per a similar fix to CAN/SLCAN driver. From Oliver Hartkopp. 15) TCP doesn't bump retransmit counters in some code paths, fix from Eric Dumazet. 16) Clamp delayed_ack in tcp_cubic to prevent theoretical divides by zero. Fix from Liu Yu. 17) Fix locking imbalance in error paths of HHF packet scheduler, from John Fastabend. 18) Properly reference the transport module when vsock_core_init() runs, from Andy King. 19) Fix buffer overflow in cdc_ncm driver, from Bjørn Mork. 20) IP_ECN_decapsulate() doesn't see a correct SKB network header in ip_tunnel_rcv(), fix from Ying Cai. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (132 commits) net: macb: Fix race between HW and driver net: macb: Remove 'unlikely' optimization net: macb: Re-enable RX interrupt only when RX is done net: macb: Clear interrupt flags net: macb: Pass same size to DMA_UNMAP as used for DMA_MAP ip_tunnel: Set network header properly for IP_ECN_decapsulate() e1000e: Restrict MDIO Slow Mode workaround to relevant parts e1000e: Fix issue with link flap on 82579 e1000e: Expand workaround for 10Mb HD throughput bug e1000e: Workaround for dropped packets in Gig/100 speeds on 82579 net/mlx4_core: Don't issue PCIe speed/width checks for VFs net/mlx4_core: Load the Eth driver first net/mlx4_core: Fix slave id computation for single port VF net/mlx4_core: Adjust port number in qp_attach wrapper when detaching net: cdc_ncm: fix buffer overflow Altera TSE: ALTERA_TSE should depend on HAS_DMA vsock: Make transport the proto owner net: sched: lock imbalance in hhf qdisc net: mvmdio: Check for a valid interrupt instead of an error net phy: Check for aneg completion before setting state to PHY_RUNNING ...
| * | | | | | ip_tunnel: Set network header properly for IP_ECN_decapsulate()Ying Cai2014-05-051-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In ip_tunnel_rcv(), set skb->network_header to inner IP header before IP_ECN_decapsulate(). Without the fix, IP_ECN_decapsulate() takes outer IP header as inner IP header, possibly causing error messages or packet drops. Note that this skb_reset_network_header() call was in this spot when the original feature for checking consistency of ECN bits through tunnels was added in eccc1bb8d4b4 ("tunnel: drop packet if ECN present with not-ECT"). It was only removed from this spot in 3d7b46cd20e3 ("ip_tunnel: push generic protocol handling to ip_tunnel module."). Fixes: 3d7b46cd20e3 ("ip_tunnel: push generic protocol handling to ip_tunnel module.") Reported-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Ying Cai <ycai@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | vsock: Make transport the proto ownerAndy King2014-05-051-25/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Right now the core vsock module is the owner of the proto family. This means there's nothing preventing the transport module from unloading if there are open sockets, which results in a panic. Fix that by allowing the transport to be the owner, which will refcount it properly. Includes version bump to 1.0.1.0-k Passes checkpatch this time, I swear... Acked-by: Dmitry Torokhov <dtor@vmware.com> Signed-off-by: Andy King <acking@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | Merge branch 'for-davem' of ↵David S. Miller2014-05-052-3/+12
| |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== pull request: wireless 2014-05-01 Please pull the following batch of fixes intended for the 3.15 stream! For the Bluetooth bits, Gustavo says: "Some fixes for 3.15. There is a revert for the intel driver, a new device id, and two important SSP fixes from Johan." On top of that... Ben Hutchings gives us a fix for an unbalanced irq enable in an rtl8192cu error path. Colin Ian King provides an rtlwifi fix for an uninitialized variable. Felix Fietkau brings a pair of ath9k fixes, one that corrects a hardware initialization value and another that removes an (unnecessary) flag that was being used in a way that led to a software tx queue hang in ath9k. Gertjan van Wingerde pushes a MAINTAINERS change to remove himself from the rt2x00 maintainer team. Hans de Goede fixes a brcmfmac firmware load hang. Larry Finger changes rtlwifi to use the correct queue for V0 traffic on rtl8192se. Rajkumar Manoharan corrects a race in ath9k driver initialization. Stanislaw Gruszka fixes an rt2x00 bug in which disabling beaconing once on USB devices led to permanently disabling beaconing for those devices. Tim Harvey provides fixes for a pair of ath9k issues that can lead to soft lockups in that driver. Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * \ \ \ \ \ Merge branch 'master' of ↵John W. Linville2014-05-012-3/+12
| | |\ \ \ \ \ \ | | | | |/ / / / | | | |/| | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
| | | * | | | | Bluetooth: Fix redundant encryption request for reauthenticationJohan Hedberg2014-04-251-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we're performing reauthentication (in order to elevate the security level from an unauthenticated key to an authenticated one) we do not need to issue any encryption command once authentication completes. Since the trigger for the encryption HCI command is the ENCRYPT_PEND flag this flag should not be set in this scenario. Instead, the REAUTH_PEND flag takes care of all necessary steps for reauthentication. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org
| | | * | | | | Bluetooth: Fix triggering BR/EDR L2CAP Connect too earlyJohan Hedberg2014-04-251-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 1c2e004183178 introduced an event handler for the encryption key refresh complete event with the intent of fixing some LE/SMP cases. However, this event is shared with BR/EDR and there we actually want to act only on the auth_complete event (which comes after the key refresh). If we do not do this we may trigger an L2CAP Connect Request too early and cause the remote side to return a security block error. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org
| * | | | | | | net: sched: lock imbalance in hhf qdiscJohn Fastabend2014-05-041-5/+6
| |/ / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | hhf_change() takes the sch_tree_lock and releases it but misses the error cases. Fix the missed case here. To reproduce try a command like this, # tc qdisc change dev p3p2 root hhf quantum 40960 non_hh_weight 300000 Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | tcp_cubic: fix the range of delayed_ackLiu Yu2014-04-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit b9f47a3aaeab (tcp_cubic: limit delayed_ack ratio to prevent divide error) try to prevent divide error, but there is still a little chance that delayed_ack can reach zero. In case the param cnt get negative value, then ratio+cnt would overflow and may happen to be zero. As a result, min(ratio, ACK_RATIO_LIMIT) will calculate to be zero. In some old kernels, such as 2.6.32, there is a bug that would pass negative param, which then ultimately leads to this divide error. commit 5b35e1e6e9c (tcp: fix tcp_trim_head() to adjust segment count with skb MSS) fixed the negative param issue. However, it's safe that we fix the range of delayed_ack as well, to make sure we do not hit a divide by zero. CC: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Liu Yu <allanyuliu@tencent.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | tcp: increment retransmit counters in tlp and fast openEric Dumazet2014-04-301-7/+7
| | |/ / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Both TLP and Fast Open call __tcp_retransmit_skb() instead of tcp_retransmit_skb() to avoid changing tp->retrans_out. This has the side effect of missing SNMP counters increments as well as tcp_info tcpi_total_retrans updates. Fix this by moving the stats increments of into __tcp_retransmit_skb() Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Nandita Dukkipati <nanditad@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | net: ipv6: more places need LOOPBACK_IFINDEX for flowi6_iifJulian Anastasov2014-04-283-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To properly match iif in ip rules we have to provide LOOPBACK_IFINDEX in flowi6_iif, not 0. Some ip6mr_fib_lookup and fib6_rule_lookup callers need such fix. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | net: sctp: Don't transition to PF state when transport has exhausted ↵Karl Heiss2014-04-271-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'Path.Max.Retrans'. Don't transition to the PF state on every strike after 'Path.Max.Retrans'. Per draft-ietf-tsvwg-sctp-failover-03 Section 5.1.6: Additional (PMR - PFMR) consecutive timeouts on a PF destination confirm the path failure, upon which the destination transitions to the Inactive state. As described in [RFC4960], the sender (i) SHOULD notify ULP about this state transition, and (ii) transmit heartbeats to the Inactive destination at a lower frequency as described in Section 8.3 of [RFC4960]. This also prevents sending SCTP_ADDR_UNREACHABLE to the user as the state bounces between SCTP_INACTIVE and SCTP_PF for each subsequent strike. Signed-off-by: Karl Heiss <kheiss@gmail.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | sctp: reset flowi4_oif parameter on route lookupXufeng Zhang2014-04-271-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 813b3b5db83 (ipv4: Use caller's on-stack flowi as-is in output route lookups.) introduces another regression which is very similar to the problem of commit e6b45241c (ipv4: reset flowi parameters on route connect) wants to fix: Before we call ip_route_output_key() in sctp_v4_get_dst() to get a dst that matches a bind address as the source address, we have already called this function previously and the flowi parameters have been initialized including flowi4_oif, so when we call this function again, the process in __ip_route_output_key() will be different because of the setting of flowi4_oif, and we'll get a networking device which corresponds to the inputted flowi4_oif as the output device, this is wrong because we'll never hit this place if the previously returned source address of dst match one of the bound addresses. To reproduce this problem, a vlan setting is enough: # ifconfig eth0 up # route del default # vconfig add eth0 2 # vconfig add eth0 3 # ifconfig eth0.2 10.0.1.14 netmask 255.255.255.0 # route add default gw 10.0.1.254 dev eth0.2 # ifconfig eth0.3 10.0.0.14 netmask 255.255.255.0 # ip rule add from 10.0.0.14 table 4 # ip route add table 4 default via 10.0.0.254 src 10.0.0.14 dev eth0.3 # sctp_darn -H 10.0.0.14 -P 36422 -h 10.1.4.134 -p 36422 -s -I You'll detect that all the flow are routed to eth0.2(10.0.1.254). Signed-off-by: Xufeng Zhang <xufeng.zhang@windriver.com> Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | bridge: Handle IFLA_ADDRESS correctly when creating bridge deviceToshiaki Makita2014-04-271-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When bridge device is created with IFLA_ADDRESS, we are not calling br_stp_change_bridge_id(), which leads to incorrect local fdb management and bridge id calculation, and prevents us from receiving frames on the bridge device. Reported-by: Tom Gundersen <teg@jklm.no> Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | ipv6: fib: fix fib dump restartKumar Sundararajan2014-04-241-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the ipv6 fib changes during a table dump, the walk is restarted and the number of nodes dumped are skipped. But the existing code doesn't advance to the next node after a node is skipped. This can cause the dump to loop or produce lots of duplicates when the fib is modified during the dump. This change advances the walk to the next node if the current node is skipped after a restart. Signed-off-by: Kumar Sundararajan <kumar@fb.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | rtnetlink: Only supply IFLA_VF_PORTS information when RTEXT_FILTER_VF is setDavid Gibson2014-04-241-6/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since 115c9b81928360d769a76c632bae62d15206a94a (rtnetlink: Fix problem with buffer allocation), RTM_NEWLINK messages only contain the IFLA_VFINFO_LIST attribute if they were solicited by a GETLINK message containing an IFLA_EXT_MASK attribute with the RTEXT_FILTER_VF flag. That was done because some user programs broke when they received more data than expected - because IFLA_VFINFO_LIST contains information for each VF it can become large if there are many VFs. However, the IFLA_VF_PORTS attribute, supplied for devices which implement ndo_get_vf_port (currently the 'enic' driver only), has the same problem. It supplies per-VF information and can therefore become large, but it is not currently conditional on the IFLA_EXT_MASK value. Worse, it interacts badly with the existing EXT_MASK handling. When IFLA_EXT_MASK is not supplied, the buffer for netlink replies is fixed at NLMSG_GOODSIZE. If the information for IFLA_VF_PORTS exceeds this, then rtnl_fill_ifinfo() returns -EMSGSIZE on the first message in a packet. netlink_dump() will misinterpret this as having finished the listing and omit data for this interface and all subsequent ones. That can cause getifaddrs(3) to enter an infinite loop. This patch addresses the problem by only supplying IFLA_VF_PORTS when IFLA_EXT_MASK is supplied with the RTEXT_FILTER_VF flag set. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | rtnetlink: Warn when interface's information won't fit in our packetDavid Gibson2014-04-241-5/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Without IFLA_EXT_MASK specified, the information reported for a single interface in response to RTM_GETLINK is expected to fit within a netlink packet of NLMSG_GOODSIZE. If it doesn't, however, things will go badly wrong, When listing all interfaces, netlink_dump() will incorrectly treat -EMSGSIZE on the first message in a packet as the end of the listing and omit information for that interface and all subsequent ones. This can cause getifaddrs(3) to enter an infinite loop. This patch won't fix the problem, but it will WARN_ON() making it easier to track down what's going wrong. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | netfilter: Fix warning in nfnetlink_receive().David S. Miller2014-04-241-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | net/netfilter/nfnetlink.c: In function ‘nfnetlink_rcv’: net/netfilter/nfnetlink.c:371:14: warning: unused variable ‘net’ [-Wunused-variable] Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | net: Use netlink_ns_capable to verify the permisions of netlink messagesEric W. Biederman2014-04-2415-31/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is possible by passing a netlink socket to a more privileged executable and then to fool that executable into writing to the socket data that happens to be valid netlink message to do something that privileged executable did not intend to do. To keep this from happening replace bare capable and ns_capable calls with netlink_capable, netlink_net_calls and netlink_ns_capable calls. Which act the same as the previous calls except they verify that the opener of the socket had the desired permissions as well. Reported-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | net: Add variants of capable for use on netlink messagesEric W. Biederman2014-04-241-0/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | netlink_net_capable - The common case use, for operations that are safe on a network namespace netlink_capable - For operations that are only known to be safe for the global root netlink_ns_capable - The general case of capable used to handle special cases __netlink_ns_capable - Same as netlink_ns_capable except taking a netlink_skb_parms instead of the skbuff of a netlink message. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>