| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 0398ba9e5a4b5675aa571e0445689d3c2e499c2d ]
`tipc_node_apply_property` does a null check on a `tipc_link_entry`
pointer but also accesses the same pointer out of the null check block.
This triggers a warning on Coverity Static Analyzer because we're
implying that `e->link` can BE null.
Move "Update MTU for node link entry" line into if block to make sure
that we're not in a state that `e->link` is null.
Signed-off-by: Cengiz Can <cengiz@kernel.wtf>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 7fdd375e383097a785bb65c66802e468f398bf82 ]
TCA_FLOWER_KEY_MPLS_OPT_LSE_LABEL is a u32 attribute (MPLS label is
20 bits long).
Fixes the following bug:
$ tc filter add dev ethX ingress protocol mpls_uc \
flower mpls lse depth 2 label 256 \
action drop
$ tc filter show dev ethX ingress
filter protocol mpls_uc pref 49152 flower chain 0
filter protocol mpls_uc pref 49152 flower chain 0 handle 0x1
eth_type 8847
mpls
lse depth 2 label 0 <-- invalid label 0, should be 256
...
Fixes: 61aec25a6db5 ("cls_flower: Support filtering on multiple MPLS Label Stack Entries")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 5137d303659d8c324e67814b1cc2e1bc0c0d9836 ]
The offending commit introduces a cleanup callback that is invoked
when the driver module is removed to clean up the tunnel device
flow block. But it returns on the first iteration of the for loop.
The remaining indirect flow blocks will never be freed.
Fixes: 1fac52da5942 ("net: flow_offload: consolidate indirect flow_block infrastructure")
CC: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 299bcb55ecd1412f6df606e9dc0912d55610029e ]
When cwnd is not a multiple of the TSO skb size of N*MSS, we can get
into persistent scenarios where we have the following sequence:
(1) ACK for full-sized skb of N*MSS arrives
-> tcp_write_xmit() transmit full-sized skb with N*MSS
-> move pacing release time forward
-> exit tcp_write_xmit() because pacing time is in the future
(2) TSQ callback or TCP internal pacing timer fires
-> try to transmit next skb, but TSO deferral finds remainder of
available cwnd is not big enough to trigger an immediate send
now, so we defer sending until the next ACK.
(3) repeat...
So we can get into a case where we never mark ourselves as
cwnd-limited for many seconds at a time, even with
bulk/infinite-backlog senders, because:
o In case (1) above, every time in tcp_write_xmit() we have enough
cwnd to send a full-sized skb, we are not fully using the cwnd
(because cwnd is not a multiple of the TSO skb size). So every time we
send data, we are not cwnd limited, and so in the cwnd-limited
tracking code in tcp_cwnd_validate() we mark ourselves as not
cwnd-limited.
o In case (2) above, every time in tcp_write_xmit() that we try to
transmit the "remainder" of the cwnd but defer, we set the local
variable is_cwnd_limited to true, but we do not send any packets, so
sent_pkts is zero, so we don't call the cwnd-limited logic to update
tp->is_cwnd_limited.
Fixes: ca8a22634381 ("tcp: make cwnd-limited checks measurement-based, and gentler")
Reported-by: Ingemar Johansson <ingemar.s.johansson@ericsson.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20201209035759.1225145-1-ncardwell.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit a770bf515613c6e12ae904c3593e26016de99448 ]
Syzbot reported a stack overflow in bitmap_from_arr32() called from
ethnl_parse_bitset() when bitset from netlink message is longer than
target bitmap length. While ethnl_compact_sanity_checks() makes sure that
trailing part is all zeros (i.e. the request does not try to touch bits
kernel does not recognize), we also need to cap change_bits to nbits so
that we don't try to write past the prepared bitmaps.
Fixes: 88db6d1e4f62 ("ethtool: add ethnl_parse_bitset() helper")
Reported-by: syzbot+9d39fa49d4df294aab93@syzkaller.appspotmail.com
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Link: https://lore.kernel.org/r/3487ee3a98e14cd526f55b6caaa959d2dcbcad9f.1607465316.git.mkubecek@suse.cz
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 72d05c00d7ecda85df29abd046da7e41cc071c17 ]
Before commit a337531b942b ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB")
small tcp_rmem[1] values were overridden by tcp_fixup_rcvbuf() to accommodate various MSS.
This is no longer the case, and Hazem Mohamed Abuelfotoh reported
that DRS would not work for MTU 9000 endpoints receiving regular (1500 bytes) frames.
Root cause is that tcp_init_buffer_space() uses tp->rcv_wnd for upper limit
of rcvq_space.space computation, while it can select later a smaller
value for tp->rcv_ssthresh and tp->window_clamp.
ss -temoi on receiver would show :
skmem:(r0,rb131072,t0,tb46080,f0,w0,o0,bl0,d0) rcv_space:62496 rcv_ssthresh:56596
This means that TCP can not increase its window in tcp_grow_window(),
and that DRS can never kick.
Fix this by making sure that rcvq_space.space is not bigger than number of bytes
that can be held in TCP receive queue.
People unable/unwilling to change their kernel can work around this issue by
selecting a bigger tcp_rmem[1] value as in :
echo "4096 196608 6291456" >/proc/sys/net/ipv4/tcp_rmem
Based on an initial report and patch from Hazem Mohamed Abuelfotoh
https://lore.kernel.org/netdev/20201204180622.14285-1-abuehaze@amazon.com/
Fixes: a337531b942b ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB")
Fixes: 041a14d26715 ("tcp: start receiver buffer autotuning sooner")
Reported-by: Hazem Mohamed Abuelfotoh <abuehaze@amazon.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit f55628b3e7648198e9c072b52080c5dea8678adf ]
When do cat /proc/net/netstat, the output isn't append with a new line, it looks like this:
[root@localhost ~]# cat /proc/net/netstat
...
MPTcpExt: 0 0 0 0 0 0 0 0 0 0 0 0 0[root@localhost ~]#
This is because in mptcp_seq_show(), if mptcp isn't in use, net->mib.mptcp_statistics is NULL,
so it just puts all 0 after "MPTcpExt:", and return, forgot the '\n'.
After this patch:
[root@localhost ~]# cat /proc/net/netstat
...
MPTcpExt: 0 0 0 0 0 0 0 0 0 0 0 0 0
[root@localhost ~]#
Fixes: fc518953bc9c8d7d ("mptcp: add and use MIB counter infrastructure")
Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn>
Acked-by: Florian Westphal <fw@strlen.de>
Link: https://lore.kernel.org/r/142e2fd9-58d9-bb13-fb75-951cccc2331e@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 851d0a73c90e6c8c63fef106c6c1e73df7e05d9d ]
When enabling multicast snooping, bridge module deadlocks on multicast_lock
if 1) IPv6 is enabled, and 2) there is an existing querier on the same L2
network.
The deadlock was caused by the following sequence: While holding the lock,
br_multicast_open calls br_multicast_join_snoopers, which eventually causes
IP stack to (attempt to) send out a Listener Report (in igmp6_join_group).
Since the destination Ethernet address is a multicast address, br_dev_xmit
feeds the packet back to the bridge via br_multicast_rcv, which in turn
calls br_multicast_add_group, which then deadlocks on multicast_lock.
The fix is to move the call br_multicast_join_snoopers outside of the
critical section. This works since br_multicast_join_snoopers only deals
with IP and does not modify any multicast data structures of the bridge,
so there's no need to hold the lock.
Steps to reproduce:
1. sysctl net.ipv6.conf.all.force_mld_version=1
2. have another querier
3. ip link set dev bridge type bridge mcast_snooping 0 && \
ip link set dev bridge type bridge mcast_snooping 1 < deadlock >
A typical call trace looks like the following:
[ 936.251495] _raw_spin_lock+0x5c/0x68
[ 936.255221] br_multicast_add_group+0x40/0x170 [bridge]
[ 936.260491] br_multicast_rcv+0x7ac/0xe30 [bridge]
[ 936.265322] br_dev_xmit+0x140/0x368 [bridge]
[ 936.269689] dev_hard_start_xmit+0x94/0x158
[ 936.273876] __dev_queue_xmit+0x5ac/0x7f8
[ 936.277890] dev_queue_xmit+0x10/0x18
[ 936.281563] neigh_resolve_output+0xec/0x198
[ 936.285845] ip6_finish_output2+0x240/0x710
[ 936.290039] __ip6_finish_output+0x130/0x170
[ 936.294318] ip6_output+0x6c/0x1c8
[ 936.297731] NF_HOOK.constprop.0+0xd8/0xe8
[ 936.301834] igmp6_send+0x358/0x558
[ 936.305326] igmp6_join_group.part.0+0x30/0xf0
[ 936.309774] igmp6_group_added+0xfc/0x110
[ 936.313787] __ipv6_dev_mc_inc+0x1a4/0x290
[ 936.317885] ipv6_dev_mc_inc+0x10/0x18
[ 936.321677] br_multicast_open+0xbc/0x110 [bridge]
[ 936.326506] br_multicast_toggle+0xec/0x140 [bridge]
Fixes: 4effd28c1245 ("bridge: join all-snoopers multicast address")
Signed-off-by: Joseph Huang <Joseph.Huang@garmin.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Link: https://lore.kernel.org/r/20201204235628.50653-1-Joseph.Huang@garmin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 10c678bd0a035ac2c64a9b26b222f20556227a53 ]
Guillaume noticed that: for segments udp_queue_rcv_one_skb() returns the
proto, and it should pass "ret" unmodified to ip_protocol_deliver_rcu().
Otherwize, with a negtive value passed, it will underflow inet_protos.
This can be reproduced with IPIP FOU:
# ip fou add port 5555 ipproto 4
# ethtool -K eth1 rx-gro-list on
Fixes: cf329aa42b66 ("udp: cope with UDP GRO packet misdirection")
Reported-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit ee4f52a8de2c6f78b01f10b4c330867d88c1653a ]
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.
Fixes: f8ed289fab84 ("bridge: vlan: use br_vlan_(get|put)_master to deal with refcounts")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Link: https://lore.kernel.org/r/1607071737-33875-1-git-send-email-zhangchangzhong@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 905b2032fa424f253d9126271439cc1db2b01130 ]
If tbl_mpp can not be allocated, we call mesh_table_free(tbl_path)
while tbl_path rhashtable has not yet been initialized, which causes
panics.
Simply factorize the rhashtable_init() call into mesh_table_alloc()
WARNING: CPU: 1 PID: 8474 at kernel/workqueue.c:3040 __flush_work kernel/workqueue.c:3040 [inline]
WARNING: CPU: 1 PID: 8474 at kernel/workqueue.c:3040 __cancel_work_timer+0x514/0x540 kernel/workqueue.c:3136
Modules linked in:
CPU: 1 PID: 8474 Comm: syz-executor663 Not tainted 5.10.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__flush_work kernel/workqueue.c:3040 [inline]
RIP: 0010:__cancel_work_timer+0x514/0x540 kernel/workqueue.c:3136
Code: 5d c3 e8 bf ae 29 00 0f 0b e9 f0 fd ff ff e8 b3 ae 29 00 0f 0b 43 80 3c 3e 00 0f 85 31 ff ff ff e9 34 ff ff ff e8 9c ae 29 00 <0f> 0b e9 dc fe ff ff 89 e9 80 e1 07 80 c1 03 38 c1 0f 8c 7d fd ff
RSP: 0018:ffffc9000165f5a0 EFLAGS: 00010293
RAX: ffffffff814b7064 RBX: 0000000000000001 RCX: ffff888021c80000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff888024039ca0 R08: dffffc0000000000 R09: fffffbfff1dd3e64
R10: fffffbfff1dd3e64 R11: 0000000000000000 R12: 1ffff920002cbebd
R13: ffff888024039c88 R14: 1ffff11004807391 R15: dffffc0000000000
FS: 0000000001347880(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000140 CR3: 000000002cc0a000 CR4: 00000000001506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
rhashtable_free_and_destroy+0x25/0x9c0 lib/rhashtable.c:1137
mesh_table_free net/mac80211/mesh_pathtbl.c:69 [inline]
mesh_pathtbl_init+0x287/0x2e0 net/mac80211/mesh_pathtbl.c:785
ieee80211_mesh_init_sdata+0x2ee/0x530 net/mac80211/mesh.c:1591
ieee80211_setup_sdata+0x733/0xc40 net/mac80211/iface.c:1569
ieee80211_if_add+0xd5c/0x1cd0 net/mac80211/iface.c:1987
ieee80211_add_iface+0x59/0x130 net/mac80211/cfg.c:125
rdev_add_virtual_intf net/wireless/rdev-ops.h:45 [inline]
nl80211_new_interface+0x563/0xb40 net/wireless/nl80211.c:3855
genl_family_rcv_msg_doit net/netlink/genetlink.c:739 [inline]
genl_family_rcv_msg net/netlink/genetlink.c:783 [inline]
genl_rcv_msg+0xe4e/0x1280 net/netlink/genetlink.c:800
netlink_rcv_skb+0x190/0x3a0 net/netlink/af_netlink.c:2494
genl_rcv+0x24/0x40 net/netlink/genetlink.c:811
netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
netlink_unicast+0x780/0x930 net/netlink/af_netlink.c:1330
netlink_sendmsg+0x9a8/0xd40 net/netlink/af_netlink.c:1919
sock_sendmsg_nosec net/socket.c:651 [inline]
sock_sendmsg net/socket.c:671 [inline]
____sys_sendmsg+0x519/0x800 net/socket.c:2353
___sys_sendmsg net/socket.c:2407 [inline]
__sys_sendmsg+0x2b1/0x360 net/socket.c:2440
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: 60854fd94573 ("mac80211: mesh: convert path table to rhashtable")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Link: https://lore.kernel.org/r/20201204162428.2583119-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit b410f04eb5b482b5efc4eee90de81ad35d3d923b ]
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.
Fixes: d15662682db2 ("ipv4: Allow ipv6 gateway with ipv4 routes")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/1607071695-33740-1-git-send-email-zhangchangzhong@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 4eef8b1f36f2ff06966b8f7c2143ef0c447877de ]
with the following tdc testcase:
83be: (qdisc, fq_pie) Create FQ-PIE with invalid number of flows
as fq_pie_init() fails, fq_pie_destroy() is called to clean up. Since the
timer is not yet initialized, it's possible to observe a splat like this:
INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
CPU: 0 PID: 975 Comm: tc Not tainted 5.10.0-rc4+ #298
Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014
Call Trace:
dump_stack+0x99/0xcb
register_lock_class+0x12dd/0x1750
__lock_acquire+0xfe/0x3970
lock_acquire+0x1c8/0x7f0
del_timer_sync+0x49/0xd0
fq_pie_destroy+0x3f/0x80 [sch_fq_pie]
qdisc_create+0x916/0x1160
tc_modify_qdisc+0x3c4/0x1630
rtnetlink_rcv_msg+0x346/0x8e0
netlink_unicast+0x439/0x630
netlink_sendmsg+0x719/0xbf0
sock_sendmsg+0xe2/0x110
____sys_sendmsg+0x5ba/0x890
___sys_sendmsg+0xe9/0x160
__sys_sendmsg+0xd3/0x170
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xa9
[...]
ODEBUG: assert_init not available (active state 0) object type: timer_list hint: 0x0
WARNING: CPU: 0 PID: 975 at lib/debugobjects.c:508 debug_print_object+0x162/0x210
[...]
Call Trace:
debug_object_assert_init+0x268/0x380
try_to_del_timer_sync+0x6a/0x100
del_timer_sync+0x9e/0xd0
fq_pie_destroy+0x3f/0x80 [sch_fq_pie]
qdisc_create+0x916/0x1160
tc_modify_qdisc+0x3c4/0x1630
rtnetlink_rcv_msg+0x346/0x8e0
netlink_rcv_skb+0x120/0x380
netlink_unicast+0x439/0x630
netlink_sendmsg+0x719/0xbf0
sock_sendmsg+0xe2/0x110
____sys_sendmsg+0x5ba/0x890
___sys_sendmsg+0xe9/0x160
__sys_sendmsg+0xd3/0x170
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xa9
fix it moving timer_setup() before any failure, like it was done on 'red'
with former commit 608b4adab178 ("net_sched: initialize timer earlier in
red_init()").
Fixes: ec97ecf1ebe4 ("net: sched: add Flow Queue PIE packet scheduler")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Cong Wang <cong.wang@bytedance.com>
Link: https://lore.kernel.org/r/2e78e01c504c633ebdff18d041833cf2e079a3a4.1607020450.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 4bc3c8dc9f5f1eff0d3bfa59491383ac11308b6b ]
kmemleak report a memory leak as follows:
BUG: memory leak
unreferenced object 0xffff8880759ea000 (size 256):
backtrace:
[<00000000c0bf2deb>] kmem_cache_zalloc include/linux/slab.h:656 [inline]
[<00000000c0bf2deb>] __proc_create+0x23d/0x7d0 fs/proc/generic.c:421
[<000000009d718d02>] proc_create_reg+0x8e/0x140 fs/proc/generic.c:535
[<0000000097bbfc4f>] proc_create_net_data+0x8c/0x1b0 fs/proc/proc_net.c:126
[<00000000652480fc>] ip_vs_control_net_init+0x308/0x13a0 net/netfilter/ipvs/ip_vs_ctl.c:4169
[<000000004c927ebe>] __ip_vs_init+0x211/0x400 net/netfilter/ipvs/ip_vs_core.c:2429
[<00000000aa6b72d9>] ops_init+0xa8/0x3c0 net/core/net_namespace.c:151
[<00000000153fd114>] setup_net+0x2de/0x7e0 net/core/net_namespace.c:341
[<00000000be4e4f07>] copy_net_ns+0x27d/0x530 net/core/net_namespace.c:482
[<00000000f1c23ec9>] create_new_namespaces+0x382/0xa30 kernel/nsproxy.c:110
[<00000000098a5757>] copy_namespaces+0x2e6/0x3b0 kernel/nsproxy.c:179
[<0000000026ce39e9>] copy_process+0x220a/0x5f00 kernel/fork.c:2072
[<00000000b71f4efe>] _do_fork+0xc7/0xda0 kernel/fork.c:2428
[<000000002974ee96>] __do_sys_clone3+0x18a/0x280 kernel/fork.c:2703
[<0000000062ac0a4d>] do_syscall_64+0x33/0x40 arch/x86/entry/common.c:46
[<0000000093f1ce2c>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
In the error path of ip_vs_control_net_init(), remove_proc_entry() needs
to be called to remove the added proc entry, otherwise a memory leak
will occur.
Also, add some '#ifdef CONFIG_PROC_FS' because proc_create_net* return NULL
when PROC is not used.
Fixes: b17fc9963f83 ("IPVS: netns, ip_vs_stats and its procfs")
Fixes: 61b1ab4583e2 ("IPVS: netns, add basic init per netns.")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 992b03b88e36254e26e9a4977ab948683e21bd9f ]
When a packet is fragmented by batman-adv, the original batman-adv header
is not modified. Only a new fragmentation is inserted between the original
one and the ethernet header. The code must therefore make sure that it has
a writable region of this size in the skbuff head.
But it is not useful to always reallocate the skbuff by this size even when
there would be more than enough headroom still in the skb. The reallocation
is just to costly during in this codepath.
Fixes: ee75ed88879a ("batman-adv: Fragment and send skbs larger than mtu")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit c5cbfc87558168ef4c3c27ce36eba6b83391db19 ]
The batadv net_device is trying to propagate the needed_headroom and
needed_tailroom from the lower devices. This is needed to avoid cost
intensive reallocations using pskb_expand_head during the transmission.
But the fragmentation code split the skb's without adding extra room at the
end/beginning of the various fragments. This reduced the performance of
transmissions over complex scenarios (batadv on vxlan on wireguard) because
the lower devices had to perform the reallocations at least once.
Fixes: ee75ed88879a ("batman-adv: Fragment and send skbs larger than mtu")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 4ca23e2c2074465bff55ea14221175fecdf63c5f ]
If a batman-adv packets has to be fragmented, then the original batman-adv
packet header is not stripped away. Instead, only a new header is added in
front of the packet after it was split.
This size must be considered to avoid cost intensive reallocations during
the transmission through the various device layers.
Fixes: 7bca68c7844b ("batman-adv: Add lower layer needed_(head|tail)room to own ones")
Reported-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 36ccdf85829a7dd6936dba5d02fa50138471f0d3 ]
Commit 642e450b6b59 ("xsk: Do not discard packet when NETDEV_TX_BUSY")
addressed the problem that packets were discarded from the Tx AF_XDP
ring, when the driver returned NETDEV_TX_BUSY. Part of the fix was
bumping the skbuff reference count, so that the buffer would not be
freed by dev_direct_xmit(). A reference count larger than one means
that the skbuff is "shared", which is not the case.
If the "shared" skbuff is sent to the generic XDP receive path,
netif_receive_generic_xdp(), and pskb_expand_head() is entered the
BUG_ON(skb_shared(skb)) will trigger.
This patch adds a variant to dev_direct_xmit(), __dev_direct_xmit(),
where a user can select the skbuff free policy. This allows AF_XDP to
avoid bumping the reference count, but still keep the NETDEV_TX_BUSY
behavior.
Fixes: 642e450b6b59 ("xsk: Do not discard packet when NETDEV_TX_BUSY")
Reported-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20201123175600.146255-1-bjorn.topel@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit a5d45bc0dc50f9dd83703510e9804d813a9cac32 upstream.
Userspace might match on prefix bytes of header fields if they are on
the byte boundary, this requires that the mask is adjusted accordingly.
Use NFT_OFFLOAD_MATCH_EXACT() for meta since prefix byte matching is not
allowed for this type of selector.
The bitwise expression might be optimized out by userspace, hence the
kernel needs to infer the prefix from the number of payload bytes to
match on. This patch adds nft_payload_offload_mask() to calculate the
bitmask to match on the prefix.
Fixes: c9626a2cbdb2 ("netfilter: nf_tables: add hardware offload support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 3c78e9e0d33a27ab8050e4492c03c6a1f8d0ed6b upstream.
This patch adds nft_flow_rule_set_addr_type() to set the address type
from the nft_payload expression accordingly.
If the address type is not set in the control dissector then a rule that
matches either on source or destination IP address does not work.
After this patch, nft hardware offload generates the flow dissector
configuration as tc-flower does to match on an IP address.
This patch has been also tested functionally to make sure packets are
filtered out by the NIC.
This is also getting the code aligned with the existing netfilter flow
offload infrastructure which is also setting the control dissector.
Fixes: c9626a2cbdb2 ("netfilter: nf_tables: add hardware offload support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit c0700dfa2cae44c033ed97dade8a2679c7d22a9d upstream.
There are reports wrt lockdep splat in nftables, e.g.:
------------[ cut here ]------------
WARNING: CPU: 2 PID: 31416 at net/netfilter/nf_tables_api.c:622
lockdep_nfnl_nft_mutex_not_held+0x28/0x38 [nf_tables]
...
These are caused by an earlier, unrelated bug such as a n ABBA deadlock
in a different subsystem.
In such an event, lockdep is disabled and lockdep_is_held returns true
unconditionally. This then causes the WARN() in nf_tables.
Make the WARN conditional on lockdep still active to avoid this.
Fixes: f102d66b335a417 ("netfilter: nf_tables: use dedicated mutex to guard transactions")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Link: https://lore.kernel.org/linux-kselftest/CA+G9fYvFUpODs+NkSYcnwKnXm62tmP=ksLeBPmB+KFrB2rvCtQ@mail.gmail.com/
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit d966ddcc38217a6110a6a0ff37ad2dee7d42e23e upstream.
In the commit fdeba99b1e58
("tipc: fix use-after-free in tipc_bcast_get_mode"), we're trying
to make sure the tipc_net_finalize_work work item finished if it
enqueued. But calling flush_scheduled_work() is not just affecting
above work item but either any scheduled work. This has turned out
to be overkill and caused to deadlock as syzbot reported:
======================================================
WARNING: possible circular locking dependency detected
5.9.0-rc2-next-20200828-syzkaller #0 Not tainted
------------------------------------------------------
kworker/u4:6/349 is trying to acquire lock:
ffff8880aa063d38 ((wq_completion)events){+.+.}-{0:0}, at: flush_workqueue+0xe1/0x13e0 kernel/workqueue.c:2777
but task is already holding lock:
ffffffff8a879430 (pernet_ops_rwsem){++++}-{3:3}, at: cleanup_net+0x9b/0xb10 net/core/net_namespace.c:565
[...]
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(pernet_ops_rwsem);
lock(&sb->s_type->i_mutex_key#13);
lock(pernet_ops_rwsem);
lock((wq_completion)events);
*** DEADLOCK ***
[...]
v1:
To fix the original issue, we replace above calling by introducing
a bit flag. When a namespace cleaned-up, bit flag is set to zero and:
- tipc_net_finalize functionial just does return immediately.
- tipc_net_finalize_work does not enqueue into the scheduled work queue.
v2:
Use cancel_work_sync() helper to make sure ONLY the
tipc_net_finalize_work() stopped before releasing bcbase object.
Reported-by: syzbot+d5aa7e0385f6a5d0f4fd@syzkaller.appspotmail.com
Fixes: fdeba99b1e58 ("tipc: fix use-after-free in tipc_bcast_get_mode")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Hoang Huu Le <hoang.h.le@dektech.com.au>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 68ad89de918e1c5a79c9c56127e5e31741fd517e upstream.
syzbot found that we are not validating user input properly
before copying 16 bytes [1].
Using NLA_BINARY in ipaddr_policy[] for IPv6 address is not correct,
since it ensures at most 16 bytes were provided.
We should instead make sure user provided exactly 16 bytes.
In old kernels (before v4.20), fix would be to remove the NLA_BINARY,
since NLA_POLICY_EXACT_LEN() was not yet available.
[1]
BUG: KMSAN: uninit-value in hash_ip6_add+0x1cba/0x3a50 net/netfilter/ipset/ip_set_hash_gen.h:892
CPU: 1 PID: 11611 Comm: syz-executor.0 Not tainted 5.10.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x21c/0x280 lib/dump_stack.c:118
kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:118
__msan_warning+0x5f/0xa0 mm/kmsan/kmsan_instr.c:197
hash_ip6_add+0x1cba/0x3a50 net/netfilter/ipset/ip_set_hash_gen.h:892
hash_ip6_uadt+0x976/0xbd0 net/netfilter/ipset/ip_set_hash_ip.c:267
call_ad+0x329/0xd00 net/netfilter/ipset/ip_set_core.c:1720
ip_set_ad+0x111f/0x1440 net/netfilter/ipset/ip_set_core.c:1808
ip_set_uadd+0xf6/0x110 net/netfilter/ipset/ip_set_core.c:1833
nfnetlink_rcv_msg+0xc7d/0xdf0 net/netfilter/nfnetlink.c:252
netlink_rcv_skb+0x70a/0x820 net/netlink/af_netlink.c:2494
nfnetlink_rcv+0x4f0/0x4380 net/netfilter/nfnetlink.c:600
netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
netlink_unicast+0x11da/0x14b0 net/netlink/af_netlink.c:1330
netlink_sendmsg+0x173c/0x1840 net/netlink/af_netlink.c:1919
sock_sendmsg_nosec net/socket.c:651 [inline]
sock_sendmsg net/socket.c:671 [inline]
____sys_sendmsg+0xc7a/0x1240 net/socket.c:2353
___sys_sendmsg net/socket.c:2407 [inline]
__sys_sendmsg+0x6d5/0x830 net/socket.c:2440
__do_sys_sendmsg net/socket.c:2449 [inline]
__se_sys_sendmsg+0x97/0xb0 net/socket.c:2447
__x64_sys_sendmsg+0x4a/0x70 net/socket.c:2447
do_syscall_64+0x9f/0x140 arch/x86/entry/common.c:48
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x45deb9
Code: 0d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 db b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fe2e503fc78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000029ec0 RCX: 000000000045deb9
RDX: 0000000000000000 RSI: 0000000020000140 RDI: 0000000000000003
RBP: 000000000118bf60 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000118bf2c
R13: 000000000169fb7f R14: 00007fe2e50409c0 R15: 000000000118bf2c
Uninit was stored to memory at:
kmsan_save_stack_with_flags mm/kmsan/kmsan.c:121 [inline]
kmsan_internal_chain_origin+0xad/0x130 mm/kmsan/kmsan.c:289
__msan_chain_origin+0x57/0xa0 mm/kmsan/kmsan_instr.c:147
ip6_netmask include/linux/netfilter/ipset/pfxlen.h:49 [inline]
hash_ip6_netmask net/netfilter/ipset/ip_set_hash_ip.c:185 [inline]
hash_ip6_uadt+0xb1c/0xbd0 net/netfilter/ipset/ip_set_hash_ip.c:263
call_ad+0x329/0xd00 net/netfilter/ipset/ip_set_core.c:1720
ip_set_ad+0x111f/0x1440 net/netfilter/ipset/ip_set_core.c:1808
ip_set_uadd+0xf6/0x110 net/netfilter/ipset/ip_set_core.c:1833
nfnetlink_rcv_msg+0xc7d/0xdf0 net/netfilter/nfnetlink.c:252
netlink_rcv_skb+0x70a/0x820 net/netlink/af_netlink.c:2494
nfnetlink_rcv+0x4f0/0x4380 net/netfilter/nfnetlink.c:600
netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
netlink_unicast+0x11da/0x14b0 net/netlink/af_netlink.c:1330
netlink_sendmsg+0x173c/0x1840 net/netlink/af_netlink.c:1919
sock_sendmsg_nosec net/socket.c:651 [inline]
sock_sendmsg net/socket.c:671 [inline]
____sys_sendmsg+0xc7a/0x1240 net/socket.c:2353
___sys_sendmsg net/socket.c:2407 [inline]
__sys_sendmsg+0x6d5/0x830 net/socket.c:2440
__do_sys_sendmsg net/socket.c:2449 [inline]
__se_sys_sendmsg+0x97/0xb0 net/socket.c:2447
__x64_sys_sendmsg+0x4a/0x70 net/socket.c:2447
do_syscall_64+0x9f/0x140 arch/x86/entry/common.c:48
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Uninit was stored to memory at:
kmsan_save_stack_with_flags mm/kmsan/kmsan.c:121 [inline]
kmsan_internal_chain_origin+0xad/0x130 mm/kmsan/kmsan.c:289
kmsan_memcpy_memmove_metadata+0x25e/0x2d0 mm/kmsan/kmsan.c:226
kmsan_memcpy_metadata+0xb/0x10 mm/kmsan/kmsan.c:246
__msan_memcpy+0x46/0x60 mm/kmsan/kmsan_instr.c:110
ip_set_get_ipaddr6+0x2cb/0x370 net/netfilter/ipset/ip_set_core.c:310
hash_ip6_uadt+0x439/0xbd0 net/netfilter/ipset/ip_set_hash_ip.c:255
call_ad+0x329/0xd00 net/netfilter/ipset/ip_set_core.c:1720
ip_set_ad+0x111f/0x1440 net/netfilter/ipset/ip_set_core.c:1808
ip_set_uadd+0xf6/0x110 net/netfilter/ipset/ip_set_core.c:1833
nfnetlink_rcv_msg+0xc7d/0xdf0 net/netfilter/nfnetlink.c:252
netlink_rcv_skb+0x70a/0x820 net/netlink/af_netlink.c:2494
nfnetlink_rcv+0x4f0/0x4380 net/netfilter/nfnetlink.c:600
netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
netlink_unicast+0x11da/0x14b0 net/netlink/af_netlink.c:1330
netlink_sendmsg+0x173c/0x1840 net/netlink/af_netlink.c:1919
sock_sendmsg_nosec net/socket.c:651 [inline]
sock_sendmsg net/socket.c:671 [inline]
____sys_sendmsg+0xc7a/0x1240 net/socket.c:2353
___sys_sendmsg net/socket.c:2407 [inline]
__sys_sendmsg+0x6d5/0x830 net/socket.c:2440
__do_sys_sendmsg net/socket.c:2449 [inline]
__se_sys_sendmsg+0x97/0xb0 net/socket.c:2447
__x64_sys_sendmsg+0x4a/0x70 net/socket.c:2447
do_syscall_64+0x9f/0x140 arch/x86/entry/common.c:48
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Uninit was created at:
kmsan_save_stack_with_flags mm/kmsan/kmsan.c:121 [inline]
kmsan_internal_poison_shadow+0x5c/0xf0 mm/kmsan/kmsan.c:104
kmsan_slab_alloc+0x8d/0xe0 mm/kmsan/kmsan_hooks.c:76
slab_alloc_node mm/slub.c:2906 [inline]
__kmalloc_node_track_caller+0xc61/0x15f0 mm/slub.c:4512
__kmalloc_reserve net/core/skbuff.c:142 [inline]
__alloc_skb+0x309/0xae0 net/core/skbuff.c:210
alloc_skb include/linux/skbuff.h:1094 [inline]
netlink_alloc_large_skb net/netlink/af_netlink.c:1176 [inline]
netlink_sendmsg+0xdb8/0x1840 net/netlink/af_netlink.c:1894
sock_sendmsg_nosec net/socket.c:651 [inline]
sock_sendmsg net/socket.c:671 [inline]
____sys_sendmsg+0xc7a/0x1240 net/socket.c:2353
___sys_sendmsg net/socket.c:2407 [inline]
__sys_sendmsg+0x6d5/0x830 net/socket.c:2440
__do_sys_sendmsg net/socket.c:2449 [inline]
__se_sys_sendmsg+0x97/0xb0 net/socket.c:2447
__x64_sys_sendmsg+0x4a/0x70 net/socket.c:2447
do_syscall_64+0x9f/0x140 arch/x86/entry/common.c:48
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: a7b4f989a629 ("netfilter: ipset: IP set core support")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
operation sanity check
commit d73ff9b7c4eacaba0fd956d14882bcae970f8307 upstream.
To detect potential bugs in CAN protocol implementations (double removal of
receiver entries) a WARN() statement has been used if no matching list item was
found for removal.
The fault injection issued by syzkaller was able to create a situation where
the closing of a socket runs simultaneously to the notifier call chain for
removing the CAN network device in use.
This case is very unlikely in real life but it doesn't break anything.
Therefore we just replace the WARN() statement with pr_warn() to preserve the
notification for the CAN protocol development.
Reported-by: syzbot+381d06e0c8eaacb8706f@syzkaller.appspotmail.com
Reported-by: syzbot+d0ddd88c9a7432f041e6@syzkaller.appspotmail.com
Reported-by: syzbot+76d62d3b8162883c7d11@syzkaller.appspotmail.com
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Link: https://lore.kernel.org/r/20201126192140.14350-1-socketcan@hartkopp.net
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 9608fa653059c3f72faab0c148ac8773c46e7314 ]
when 'act_mpls' is used to mangle the LSE, the current value is read from
the packet dereferencing 4 bytes at mpls_hdr(): ensure that the label is
contained in the skb "linear" area.
Found by code inspection.
v2:
- use MPLS_HLEN instead of sizeof(new_lse), thanks to Jakub Kicinski
Fixes: 2a2ea50870ba ("net: sched: add mpls manipulation actions to TC")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Guillaume Nault <gnault@redhat.com>
Link: https://lore.kernel.org/r/3243506cba43d14858f3bd21ee0994160e44d64a.1606987058.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 43c13605bad44b8abbc9776d6e63f62ccb7a47d6 ]
when openvswitch is configured to mangle the LSE, the current value is
read from the packet dereferencing 4 bytes at mpls_hdr(): ensure that
the label is contained in the skb "linear" area.
Found by code inspection.
Fixes: d27cf5c59a12 ("net: core: add MPLS update core helper and use in OvS")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Link: https://lore.kernel.org/r/aa099f245d93218b84b5c056b67b6058ccf81a66.1606987185.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 13de4ed9e3a9ccbe54d05f7d5c773f69ecaf6c64 ]
skb_mpls_dec_ttl() reads the LSE without ensuring that it is contained in
the skb "linear" area. Fix this calling pskb_may_pull() before reading the
current ttl.
Found by code inspection.
Fixes: 2a2ea50870ba ("net: sched: add mpls manipulation actions to TC")
Reported-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Link: https://lore.kernel.org/r/53659f28be8bc336c113b5254dc637cc76bbae91.1606987074.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 6ee50c8e262a0f0693dad264c3c99e30e6442a56 ]
The .x25_addr[] address comes from the user and is not necessarily
NUL terminated. This leads to a couple problems. The first problem is
that the strlen() in x25_bind() can read beyond the end of the buffer.
The second problem is more subtle and could result in memory corruption.
The call tree is:
x25_connect()
--> x25_write_internal()
--> x25_addr_aton()
The .x25_addr[] buffers are copied to the "addresses" buffer from
x25_write_internal() so it will lead to stack corruption.
Verify that the strings are NUL terminated and return -EINVAL if they
are not.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Fixes: a9288525d2ae ("X25: Dont let x25_bind use addresses containing characters")
Reported-by: "kiyin(尹亮)" <kiyin@tencent.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Martin Schiller <ms@dev.tdt.de>
Link: https://lore.kernel.org/r/X8ZeAKm8FnFpN//B@mwanda
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 832ba596494b2c9eac7760259eff2d8b7dcad0ee ]
syzkaller managed to crash the kernel using an NBMA ip6gre interface. I
could reproduce it creating an NBMA ip6gre interface and forwarding
traffic to it:
skbuff: skb_under_panic: text:ffffffff8250e927 len:148 put:44 head:ffff8c03c7a33
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:109!
Call Trace:
skb_push+0x10/0x10
ip6gre_header+0x47/0x1b0
neigh_connected_output+0xae/0xf0
ip6gre tunnel provides its own header_ops->create, and sets it
conditionally when initializing the tunnel in NBMA mode. When
header_ops->create is used, dev->hard_header_len should reflect the
length of the header created. Otherwise, when not used,
dev->needed_headroom should be used.
Fixes: eb95f52fc72d ("net: ipv6_gre: Fix GRO to work on IPv6 over GRE tap")
Cc: Maria Pasechnik <mariap@mellanox.com>
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20201130161911.464106-1-atenart@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 0643334902fcdc770e2d9555811200213339a3f6 ]
In commit 682cd3cf946b6
("tipc: confgiure and apply UDP bearer MTU on running links"), we
introduced a function to change UDP bearer MTU and applied this new value
across existing per-link. However, we did not apply this new MTU value at
node level. This lead to packet dropped at link level if its size is
greater than new MTU value.
To fix this issue, we also apply this new MTU value for node level.
Fixes: 682cd3cf946b6 ("tipc: confgiure and apply UDP bearer MTU on running links")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Link: https://lore.kernel.org/r/20201130025544.3602-1-hoang.h.le@dektech.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 1ebf179037cb46c19da3a9c1e2ca16e7a754b75e ]
When inet_rtm_getroute() was converted to use the RCU variants of
ip_route_input() and ip_route_output_key(), the TOS parameters
stopped being masked with IPTOS_RT_MASK before doing the route lookup.
As a result, "ip route get" can return a different route than what
would be used when sending real packets.
For example:
$ ip route add 192.0.2.11/32 dev eth0
$ ip route add unreachable 192.0.2.11/32 tos 2
$ ip route get 192.0.2.11 tos 2
RTNETLINK answers: No route to host
But, packets with TOS 2 (ECT(0) if interpreted as an ECN bit) would
actually be routed using the first route:
$ ping -c 1 -Q 2 192.0.2.11
PING 192.0.2.11 (192.0.2.11) 56(84) bytes of data.
64 bytes from 192.0.2.11: icmp_seq=1 ttl=64 time=0.173 ms
--- 192.0.2.11 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.173/0.173/0.173/0.000 ms
This patch re-applies IPTOS_RT_MASK in inet_rtm_getroute(), to
return results consistent with real route lookups.
Fixes: 3765d35ed8b9 ("net: ipv4: Convert inet_rtm_getroute to rcu versions of route lookup")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/b2d237d08317ca55926add9654a48409ac1b8f5b.1606412894.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 44f64f23bae2f0fad25503bc7ab86cd08d04cd47 ]
Netfilter changes PACKET_OTHERHOST to PACKET_HOST before invoking the
hooks as, while it's an expected value for a bridge, routing expects
PACKET_HOST. The change is undone later on after hook traversal. This
can be seen with pairs of functions updating skb>pkt_type and then
reverting it to its original value:
For hook NF_INET_PRE_ROUTING:
setup_pre_routing / br_nf_pre_routing_finish
For hook NF_INET_FORWARD:
br_nf_forward_ip / br_nf_forward_finish
But the third case where netfilter does this, for hook
NF_INET_POST_ROUTING, the packet type is changed in br_nf_post_routing
but never reverted. A comment says:
/* We assume any code from br_dev_queue_push_xmit onwards doesn't care
* about the value of skb->pkt_type. */
But when having a tunnel (say vxlan) attached to a bridge we have the
following call trace:
br_nf_pre_routing
br_nf_pre_routing_ipv6
br_nf_pre_routing_finish
br_nf_forward_ip
br_nf_forward_finish
br_nf_post_routing <- pkt_type is updated to PACKET_HOST
br_nf_dev_queue_xmit <- but not reverted to its original value
vxlan_xmit
vxlan_xmit_one
skb_tunnel_check_pmtu <- a check on pkt_type is performed
In this specific case, this creates issues such as when an ICMPv6 PTB
should be sent back. When CONFIG_BRIDGE_NETFILTER is enabled, the PTB
isn't sent (as skb_tunnel_check_pmtu checks if pkt_type is PACKET_HOST
and returns early).
If the comment is right and no one cares about the value of
skb->pkt_type after br_dev_queue_push_xmit (which isn't true), resetting
it to its original value should be safe.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Florian Westphal <fw@strlen.de>
Link: https://lore.kernel.org/r/20201123174902.622102-1-atenart@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit d549699048b4b5c22dd710455bcdb76966e55aa3 ]
In the patchset merged by commit b9fcf0a0d826
("Merge branch 'support-AF_PACKET-for-layer-3-devices'") L3 devices which
did not have header_ops were given one for the purpose of protocol parsing
on af_packet transmit path.
That change made af_packet receive path regard these devices as having a
visible L3 header and therefore aligned incoming skb->data to point to the
skb's mac_header. Some devices, such as ipip, xfrmi, and others, do not
reset their mac_header prior to ingress and therefore their incoming
packets became malformed.
Ideally these devices would reset their mac headers, or af_packet would be
able to rely on dev->hard_header_len being 0 for such cases, but it seems
this is not the case.
Fix by changing af_packet RX ll visibility criteria to include the
existence of a '.create()' header operation, which is used when creating
a device hard header - via dev_hard_header() - by upper layers, and does
not exist in these L3 devices.
As this predicate may be useful in other situations, add it as a common
dev_has_header() helper in netdevice.h.
Fixes: b9fcf0a0d826 ("Merge branch 'support-AF_PACKET-for-layer-3-devices'")
Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Acked-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20201121062817.3178900-1-eyal.birger@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit d3ab78858f1451351221061a1c365495df196500 ]
If an msk listener receives an MPJ carrying an invalid token, it
will zero the request socket msk entry. That should later
cause fallback and subflow reset - as per RFC - at
subflow_syn_recv_sock() time due to failing hmac validation.
Since commit 4cf8b7e48a09 ("subflow: introduce and use
mptcp_can_accept_new_subflow()"), we unconditionally dereference
- in mptcp_can_accept_new_subflow - the subflow request msk
before performing hmac validation. In the above scenario we
hit a NULL ptr dereference.
Address the issue doing the hmac validation earlier.
Fixes: 4cf8b7e48a09 ("subflow: introduce and use mptcp_can_accept_new_subflow()")
Tested-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Link: https://lore.kernel.org/r/03b2cfa3ac80d8fc18272edc6442a9ddf0b1e34e.1606400227.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 3fe356d58efae54dade9ec94ea7c919ed20cf4db ]
Starting from commit 8692cefc433f ("virtio_vsock: Fix race condition
in virtio_transport_recv_pkt"), we discard packets in
virtio_transport_recv_pkt() if the socket has been released.
When the socket is connected, we schedule a delayed work to wait the
RST packet from the other peer, also if SHUTDOWN_MASK is set in
sk->sk_shutdown.
This is done to complete the virtio-vsock shutdown algorithm, releasing
the port assigned to the socket definitively only when the other peer
has consumed all the packets.
If we discard the RST packet received, the socket will be closed only
when the VSOCK_CLOSE_TIMEOUT is reached.
Sergio discovered the issue while running ab(1) HTTP benchmark using
libkrun [1] and observing a latency increase with that commit.
To avoid this issue, we discard packet only if the socket is really
closed (SOCK_DONE flag is set).
We also set SOCK_DONE in virtio_transport_release() when we don't need
to wait any packets from the other peer (we didn't schedule the delayed
work). In this case we remove the socket from the vsock lists, releasing
the port assigned.
[1] https://github.com/containers/libkrun
Fixes: 8692cefc433f ("virtio_vsock: Fix race condition in virtio_transport_recv_pkt")
Cc: justin.he@arm.com
Reported-by: Sergio Lopez <slp@redhat.com>
Tested-by: Sergio Lopez <slp@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Jia He <justin.he@arm.com>
Link: https://lore.kernel.org/r/20201120104736.73749-1-sgarzare@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 55472017a4219ca965a957584affdb17549ae4a4 ]
When setting congestion control via a BPF program it is seen that the
SYN/ACK for packets within a given flow will not include the ECT0 flag. A
bit of simple printk debugging shows that when this is configured without
BPF we will see the value INET_ECN_xmit value initialized in
tcp_assign_congestion_control however when we configure this via BPF the
socket is in the closed state and as such it isn't configured, and I do not
see it being initialized when we transition the socket into the listen
state. The result of this is that the ECT0 bit is configured based on
whatever the default state is for the socket.
Any easy way to reproduce this is to monitor the following with tcpdump:
tools/testing/selftests/bpf/test_progs -t bpf_tcp_ca
Without this patch the SYN/ACK will follow whatever the default is. If dctcp
all SYN/ACK packets will have the ECT0 bit set, and if it is not then ECT0
will be cleared on all SYN/ACK packets. With this patch applied the SYN/ACK
bit matches the value seen on the other packets in the given stream.
Fixes: 91b5b21c7c16 ("bpf: Add support for changing congestion control")
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 985f7337421a811cb354ca93882f943c8335a6f5 ]
When setting sk_err, set it to ee_errno, not ee_origin.
Commit f5f99309fa74 ("sock: do not set sk_err in
sock_dequeue_err_skb") disabled updating sk_err on errq dequeue,
which is correct for most error types (origins):
- sk->sk_err = err;
Commit 38b257938ac6 ("sock: reset sk_err when the error queue is
empty") reenabled the behavior for IMCP origins, which do require it:
+ if (icmp_next)
+ sk->sk_err = SKB_EXT_ERR(skb_next)->ee.ee_origin;
But read from ee_errno.
Fixes: 38b257938ac6 ("sock: reset sk_err when the error queue is empty")
Reported-by: Ayush Ranjan <ayushranjan@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Link: https://lore.kernel.org/r/20201126151220.2819322-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 3b3fd068c56e3fbea30090859216a368398e39bf ]
rose_send_frame() dereferences `neigh->dev` when called from
rose_transmit_clear_request(), and the first occurrence of the
`neigh` is in rose_loopback_timer() as `rose_loopback_neigh`,
and it is initialized in rose_add_loopback_neigh() as NULL.
i.e when `rose_loopback_neigh` used in rose_loopback_timer()
its `->dev` was still NULL and rose_loopback_timer() was calling
rose_rx_call_request() without checking for NULL.
- net/rose/rose_link.c
This bug seems to get triggered in this line:
rose_call = (ax25_address *)neigh->dev->dev_addr;
Fix it by adding NULL checking for `rose_loopback_neigh->dev`
in rose_loopback_timer().
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Reported-by: syzbot+a1c743815982d9496393@syzkaller.appspotmail.com
Tested-by: syzbot+a1c743815982d9496393@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?id=9d2a7ca8c7f2e4b682c97578dfa3f236258300b3
Signed-off-by: Anmol Karn <anmol.karan123@gmail.com>
Link: https://lore.kernel.org/r/20201119191043.28813-1-anmol.karan123@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 025cc2fb6a4e84e9a0552c0017dcd1c24b7ac7da ]
tls_device_offload_cleanup_rx doesn't clear tls_ctx->netdev after
calling tls_dev_del if TLX TX offload is also enabled. Clearing
tls_ctx->netdev gets postponed until tls_device_gc_task. It leaves a
time frame when tls_device_down may get called and call tls_dev_del for
RX one extra time, confusing the driver, which may lead to a crash.
This patch corrects this racy behavior by adding a flag to prevent
tls_device_down from calling tls_dev_del the second time.
Fixes: e8f69799810c ("net/tls: Add generic NIC offload infrastructure")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20201125221810.69870-1-saeedm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 20ffc7adf53a5fd3d19751fbff7895bcca66686e ]
In case when tcp socket received FIN after some data and the
parser haven't started before reading data caller will receive
an empty buffer. This behavior differs from plain TCP socket and
leads to special treating in user-space.
The flow that triggers the race is simple. Server sends small
amount of data right after the connection is configured to use TLS
and closes the connection. In this case receiver sees TLS Handshake
data, configures TLS socket right after Change Cipher Spec record.
While the configuration is in process, TCP socket receives small
Application Data record, Encrypted Alert record and FIN packet. So
the TCP socket changes sk_shutdown to RCV_SHUTDOWN and sk_flag with
SK_DONE bit set. The received data is not parsed upon arrival and is
never sent to user-space.
Patch unpauses parser directly if we have unparsed data in tcp
receive queue.
Fixes: fcf4793e278e ("tls: check RCV_SHUTDOWN in tls_wait_data")
Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
Link: https://lore.kernel.org/r/1605801588-12236-1-git-send-email-vfedorenko@novek.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 69929d4c49e182f8526d42c43b37b460d562d3a0 ]
Currently, the openvswitch module is not accepting the correctly formated
netlink message for the TTL decrement action. For both setting and getting
the dec_ttl action, the actions should be nested in the
OVS_DEC_TTL_ATTR_ACTION attribute as mentioned in the openvswitch.h uapi.
When the original patch was sent, it was tested with a private OVS userspace
implementation. This implementation was unfortunately not upstreamed and
reviewed, hence an erroneous version of this patch was sent out.
Leaving the patch as-is would cause problems as the kernel module could
interpret additional attributes as actions and vice-versa, due to the
actions not being encapsulated/nested within the actual attribute, but
being concatinated after it.
Fixes: 744676e77720 ("openvswitch: add TTL decrement action")
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Link: https://lore.kernel.org/r/160622121495.27296.888010441924340582.stgit@wsfd-netdev64.ntdv.lab.eng.bos.redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit c5dab0941fcdc9664eb0ec0d4d51433216d91336 ]
Child sockets erroneously inherit their parent's sk_type (ie. SOCK_*),
instead of the PF_IUCV protocol that the parent was created with in
iucv_sock_create().
We're currently not using sk->sk_protocol ourselves, so this shouldn't
have much impact (except eg. getting the output in skb_dump() right).
Fixes: eac3731bd04c ("[S390]: Add AF_IUCV socket support")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Link: https://lore.kernel.org/r/20201120100657.34407-1-jwi@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit e255e11e66da8281e337e4e352956e8a4999fca4 ]
kmemleak report a memory leak as follows:
unreferenced object 0xffff8880059c6a00 (size 64):
comm "ip", pid 23696, jiffies 4296590183 (age 1755.384s)
hex dump (first 32 bytes):
20 01 00 10 00 00 00 00 00 00 00 00 00 00 00 00 ...............
1c 00 00 00 00 00 00 00 00 00 00 00 07 00 00 00 ................
backtrace:
[<00000000aa4e7a87>] ip6addrlbl_add+0x90/0xbb0
[<0000000070b8d7f1>] ip6addrlbl_net_init+0x109/0x170
[<000000006a9ca9d4>] ops_init+0xa8/0x3c0
[<000000002da57bf2>] setup_net+0x2de/0x7e0
[<000000004e52d573>] copy_net_ns+0x27d/0x530
[<00000000b07ae2b4>] create_new_namespaces+0x382/0xa30
[<000000003b76d36f>] unshare_nsproxy_namespaces+0xa1/0x1d0
[<0000000030653721>] ksys_unshare+0x3a4/0x780
[<0000000007e82e40>] __x64_sys_unshare+0x2d/0x40
[<0000000031a10c08>] do_syscall_64+0x33/0x40
[<0000000099df30e7>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
We should free all rules when we catch an error in ip6addrlbl_net_init().
otherwise a memory leak will occur.
Fixes: 2a8cc6c89039 ("[IPV6] ADDRCONF: Support RFC3484 configurable address selection policy table.")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Link: https://lore.kernel.org/r/20201124071728.8385-1-wanghai38@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit a7b43649507dae4e55ff0087cad4e4dd1c6d5b99 ]
When devlink reload operation is not used, netdev of an Ethernet port may
be present in different net namespace than the net namespace of the
devlink instance.
Ensure that both the devlink instance and devlink port netdev are located
in same net namespace.
Fixes: 070c63f20f6c ("net: devlink: allow to change namespaces during reload")
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit b187c9b4178b87954dbc94e78a7094715794714f ]
A netdevice of a devlink port can be moved to different net namespace
than its parent devlink instance.
This scenario occurs when devlink reload is not used.
When netdevice is undergoing migration to net namespace, its ifindex
and name may change.
In such use case, devlink port query may read stale netdev attributes.
Fix it by reading them under rtnl lock.
Fixes: bfcd3a466172 ("Introduce devlink infrastructure")
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 14a2e551faea53d45bc11629a9dac88f88950ca7 ]
If THIS_MODULE is not set, the module would be removed while debugfs is
being used.
It eventually makes kernel panic.
Fixes: c6c8fea29769 ("net: Add batman-adv meshing protocol")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit c09c8a27b9baa417864b9adc3228b10ae5eeec93 upstream.
Checking for ifdef CONFIG_x fails if CONFIG_x=m.
Use IS_ENABLED instead, which is true for both built-ins and modules.
Otherwise, a
> ip -4 route add 1.2.3.4/32 via inet6 fe80::2 dev eth1
fails with the message "Error: IPv6 support not enabled in kernel." if
CONFIG_IPV6 is `m`.
In the spirit of b8127113d01e53adba15b41aefd37b90ed83d631.
Fixes: d15662682db2 ("ipv4: Allow ipv6 gateway with ipv4 routes")
Cc: Kim Phillips <kim.phillips@arm.com>
Signed-off-by: Florian Klink <flokli@flokli.de>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20201115224509.2020651-1-flokli@flokli.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 7bc40aedf24d31d8bea80e1161e996ef4299fb10 upstream.
If sta_info_insert_finish() fails, we currently keep the station
around and free it only in the caller, but there's only one such
caller and it always frees it immediately.
As syzbot found, another consequence of this split is that we can
put things that sleep only into __cleanup_single_sta() and not in
sta_info_free(), but this is the only place that requires such of
sta_info_free() now.
Change this to free the station in sta_info_insert_finish(), in
which case we can still sleep. This will also let us unify the
cleanup code later.
Cc: stable@vger.kernel.org
Fixes: dcd479e10a05 ("mac80211: always wind down STA state")
Reported-by: syzbot+32c6c38c4812d22f2f0b@syzkaller.appspotmail.com
Reported-by: syzbot+4c81fe92e372d26c4246@syzkaller.appspotmail.com
Reported-by: syzbot+6a7fe9faf0d1d61bc24a@syzkaller.appspotmail.com
Reported-by: syzbot+abed06851c5ffe010921@syzkaller.appspotmail.com
Reported-by: syzbot+b7aeb9318541a1c709f1@syzkaller.appspotmail.com
Reported-by: syzbot+d5a9416c6cafe53b5dd0@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/20201112112201.ee6b397b9453.I9c31d667a0ea2151441cc64ed6613d36c18a48e0@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit b2911a84396f72149dce310a3b64d8948212c1b3 upstream.
Some drivers fill the status rate list without setting the rate index after
the final rate to -1. minstrel_ht already deals with this, but minstrel
doesn't, which causes it to get stuck at the lowest rate on these drivers.
Fix this by checking the count as well.
Cc: stable@vger.kernel.org
Fixes: cccf129f820e ("mac80211: add the 'minstrel' rate control algorithm")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20201111183359.43528-3-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 4fe40b8e1566dad04c87fbf299049a1d0d4bd58d upstream.
Deferring sampling attempts to the second stage has some bad interactions
with drivers that process the rate table in hardware and use the probe flag
to indicate probing packets (e.g. most mt76 drivers). On affected drivers
it can lead to probing not working at all.
If the link conditions turn worse, it might not be such a good idea to
do a lot of sampling for lower rates in this case.
Fix this by simply skipping the sample attempt instead of deferring it,
but keep the checks that would allow it to be sampled if it was skipped
too often, but only if it has less than 95% success probability.
Also ensure that IEEE80211_TX_CTL_RATE_CTRL_PROBE is set for all probing
packets.
Cc: stable@vger.kernel.org
Fixes: cccf129f820e ("mac80211: add the 'minstrel' rate control algorithm")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20201111183359.43528-2-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|