summaryrefslogtreecommitdiffstats
path: root/net/sctp
Commit message (Collapse)AuthorAgeFilesLines
...
* | sctp: add SCTP_ADDR_POTENTIALLY_FAILED notificationXin Long2019-11-081-18/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SCTP Quick failover draft section 5.1, point 5 has been removed from rfc7829. Instead, "the sender SHOULD (i) notify the Upper Layer Protocol (ULP) about this state transition", as said in section 3.2, point 8. So this patch is to add SCTP_ADDR_POTENTIALLY_FAILED, defined in section 7.1, "which is reported if the affected address becomes PF". Also remove transport cwnd's update when moving from PF back to ACTIVE , which is no longer in rfc7829 either. Note that ulp_notify will be set to false if asoc->expose is not 'enabled', according to last patch. v2->v3: - define SCTP_ADDR_PF SCTP_ADDR_POTENTIALLY_FAILED. v3->v4: - initialize spc_state with SCTP_ADDR_AVAILABLE, as Marcelo suggested. - check asoc->pf_expose in sctp_assoc_control_transport(), as Marcelo suggested. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | sctp: add pf_expose per netns and sock and asocXin Long2019-11-084-2/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As said in rfc7829, section 3, point 12: The SCTP stack SHOULD expose the PF state of its destination addresses to the ULP as well as provide the means to notify the ULP of state transitions of its destination addresses from active to PF, and vice versa. However, it is recommended that an SCTP stack implementing SCTP-PF also allows for the ULP to be kept ignorant of the PF state of its destinations and the associated state transitions, thus allowing for retention of the simpler state transition model of [RFC4960] in the ULP. Not only does it allow to expose the PF state to ULP, but also allow to ignore sctp-pf to ULP. So this patch is to add pf_expose per netns, sock and asoc. And in sctp_assoc_control_transport(), ulp_notify will be set to false if asoc->expose is not 'enabled' in next patch. It also allows a user to change pf_expose per netns by sysctl, and pf_expose per sock and asoc will be initialized with it. Note that pf_expose also works for SCTP_GET_PEER_ADDR_INFO sockopt, to not allow a user to query the state of a sctp-pf peer address when pf_expose is 'disabled', as said in section 7.3. v1->v2: - Fix a build warning noticed by Nathan Chancellor. v2->v3: - set pf_expose to UNUSED by default to keep compatible with old applications. v3->v4: - add a new entry for pf_expose on ip-sysctl.txt, as Marcelo suggested. - change this patch to 1/5, and move sctp_assoc_control_transport change into 2/5, as Marcelo suggested. - use SCTP_PF_EXPOSE_UNSET instead of SCTP_PF_EXPOSE_UNUSED, and set SCTP_PF_EXPOSE_UNSET to 0 in enum, as Marcelo suggested. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: annotate lockless accesses to sk->sk_max_ack_backlogEric Dumazet2019-11-062-3/+3
| | | | | | | | | | | | | | | | | | | | | | sk->sk_max_ack_backlog can be read without any lock being held at least in TCP/DCCP cases. We need to use READ_ONCE()/WRITE_ONCE() to avoid load/store tearing and/or potential KCSAN warnings. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: annotate lockless accesses to sk->sk_ack_backlogEric Dumazet2019-11-061-1/+1
| | | | | | | | | | | | | | | | | | sk->sk_ack_backlog can be read without any lock being held. We need to use READ_ONCE()/WRITE_ONCE() to avoid load/store tearing and/or potential KCSAN warnings. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: use helpers to change sk_ack_backlogEric Dumazet2019-11-062-3/+3
| | | | | | | | | | | | | | | | | | | | Writers are holding a lock, but many readers do not. Following patch will add appropriate barriers in sk_acceptq_removed() and sk_acceptq_added(). Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller2019-11-021-4/+4
|\| | | | | | | | | | | | | | | | | The only slightly tricky merge conflict was the netdevsim because the mutex locking fix overlapped a lot of driver reload reorganization. The rest were (relatively) trivial in nature. Signed-off-by: David S. Miller <davem@davemloft.net>
| * inet: stop leaking jiffies on the wireEric Dumazet2019-11-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Historically linux tried to stick to RFC 791, 1122, 2003 for IPv4 ID field generation. RFC 6864 made clear that no matter how hard we try, we can not ensure unicity of IP ID within maximum lifetime for all datagrams with a given source address/destination address/protocol tuple. Linux uses a per socket inet generator (inet_id), initialized at connection startup with a XOR of 'jiffies' and other fields that appear clear on the wire. Thiemo Nagel pointed that this strategy is a privacy concern as this provides 16 bits of entropy to fingerprint devices. Let's switch to a random starting point, this is just as good as far as RFC 6864 is concerned and does not leak anything critical. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Thiemo Nagel <tnagel@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: use skb_queue_empty_lockless() in busy poll contextsEric Dumazet2019-10-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | Busy polling usually runs without locks. Let's use skb_queue_empty_lockless() instead of skb_queue_empty() Also uses READ_ONCE() in __skb_try_recv_datagram() to address a similar potential problem. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: use skb_queue_empty_lockless() in poll() handlersEric Dumazet2019-10-281-2/+2
| | | | | | | | | | | | | | | | Many poll() handlers are lockless. Using skb_queue_empty_lockless() instead of skb_queue_empty() is more appropriate. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller2019-10-204-14/+20
|\| | | | | | | | | | | | | Several cases of overlapping changes which were for the most part trivially resolvable. Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds2019-10-193-8/+14
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking fixes from David Miller: "I was battling a cold after some recent trips, so quite a bit piled up meanwhile, sorry about that. Highlights: 1) Fix fd leak in various bpf selftests, from Brian Vazquez. 2) Fix crash in xsk when device doesn't support some methods, from Magnus Karlsson. 3) Fix various leaks and use-after-free in rxrpc, from David Howells. 4) Fix several SKB leaks due to confusion of who owns an SKB and who should release it in the llc code. From Eric Biggers. 5) Kill a bunc of KCSAN warnings in TCP, from Eric Dumazet. 6) Jumbo packets don't work after resume on r8169, as the BIOS resets the chip into non-jumbo mode during suspend. From Heiner Kallweit. 7) Corrupt L2 header during MPLS push, from Davide Caratti. 8) Prevent possible infinite loop in tc_ctl_action, from Eric Dumazet. 9) Get register bits right in bcmgenet driver, based upon chip version. From Florian Fainelli. 10) Fix mutex problems in microchip DSA driver, from Marek Vasut. 11) Cure race between route lookup and invalidation in ipv4, from Wei Wang. 12) Fix performance regression due to false sharing in 'net' structure, from Eric Dumazet" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (145 commits) net: reorder 'struct net' fields to avoid false sharing net: dsa: fix switch tree list net: ethernet: dwmac-sun8i: show message only when switching to promisc net: aquantia: add an error handling in aq_nic_set_multicast_list net: netem: correct the parent's backlog when corrupted packet was dropped net: netem: fix error path for corrupted GSO frames macb: propagate errors when getting optional clocks xen/netback: fix error path of xenvif_connect_data() net: hns3: fix mis-counting IRQ vector numbers issue net: usb: lan78xx: Connect PHY before registering MAC vsock/virtio: discard packets if credit is not respected vsock/virtio: send a credit update when buffer size is changed mlxsw: spectrum_trap: Push Ethernet header before reporting trap net: ensure correct skb->tstamp in various fragmenters net: bcmgenet: reset 40nm EPHY on energy detect net: bcmgenet: soft reset 40nm EPHYs before MAC init net: phy: bcm7xxx: define soft_reset for 40nm EPHY net: bcmgenet: don't set phydev->link from MAC net: Update address for MediaTek ethernet driver in MAINTAINERS ipv4: fix race condition between route lookup and invalidation ...
| | * sctp: change sctp_prot .no_autobind with trueXin Long2019-10-151-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | syzbot reported a memory leak: BUG: memory leak, unreferenced object 0xffff888120b3d380 (size 64): backtrace: [...] slab_alloc mm/slab.c:3319 [inline] [...] kmem_cache_alloc+0x13f/0x2c0 mm/slab.c:3483 [...] sctp_bucket_create net/sctp/socket.c:8523 [inline] [...] sctp_get_port_local+0x189/0x5a0 net/sctp/socket.c:8270 [...] sctp_do_bind+0xcc/0x200 net/sctp/socket.c:402 [...] sctp_bindx_add+0x4b/0xd0 net/sctp/socket.c:497 [...] sctp_setsockopt_bindx+0x156/0x1b0 net/sctp/socket.c:1022 [...] sctp_setsockopt net/sctp/socket.c:4641 [inline] [...] sctp_setsockopt+0xaea/0x2dc0 net/sctp/socket.c:4611 [...] sock_common_setsockopt+0x38/0x50 net/core/sock.c:3147 [...] __sys_setsockopt+0x10f/0x220 net/socket.c:2084 [...] __do_sys_setsockopt net/socket.c:2100 [inline] It was caused by when sending msgs without binding a port, in the path: inet_sendmsg() -> inet_send_prepare() -> inet_autobind() -> .get_port/sctp_get_port(), sp->bind_hash will be set while bp->port is not. Later when binding another port by sctp_setsockopt_bindx(), a new bucket will be created as bp->port is not set. sctp's autobind is supposed to call sctp_autobind() where it does all things including setting bp->port. Since sctp_autobind() is called in sctp_sendmsg() if the sk is not yet bound, it should have skipped the auto bind. THis patch is to avoid calling inet_autobind() in inet_send_prepare() by changing sctp_prot .no_autobind with true, also remove the unused .get_port. Reported-by: syzbot+d44f7bbebdea49dbc84a@syzkaller.appspotmail.com Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * net: silence KCSAN warnings about sk->sk_backlog.len readsEric Dumazet2019-10-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sk->sk_backlog.len can be written by BH handlers, and read from process contexts in a lockless way. Note the write side should also use WRITE_ONCE() or a variant. We need some agreement about the best way to do this. syzbot reported : BUG: KCSAN: data-race in tcp_add_backlog / tcp_grow_window.isra.0 write to 0xffff88812665f32c of 4 bytes by interrupt on cpu 1: sk_add_backlog include/net/sock.h:934 [inline] tcp_add_backlog+0x4a0/0xcc0 net/ipv4/tcp_ipv4.c:1737 tcp_v4_rcv+0x1aba/0x1bf0 net/ipv4/tcp_ipv4.c:1925 ip_protocol_deliver_rcu+0x51/0x470 net/ipv4/ip_input.c:204 ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231 NF_HOOK include/linux/netfilter.h:305 [inline] NF_HOOK include/linux/netfilter.h:299 [inline] ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252 dst_input include/net/dst.h:442 [inline] ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413 NF_HOOK include/linux/netfilter.h:305 [inline] NF_HOOK include/linux/netfilter.h:299 [inline] ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523 __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5004 __netif_receive_skb+0x37/0xf0 net/core/dev.c:5118 netif_receive_skb_internal+0x59/0x190 net/core/dev.c:5208 napi_skb_finish net/core/dev.c:5671 [inline] napi_gro_receive+0x28f/0x330 net/core/dev.c:5704 receive_buf+0x284/0x30b0 drivers/net/virtio_net.c:1061 virtnet_receive drivers/net/virtio_net.c:1323 [inline] virtnet_poll+0x436/0x7d0 drivers/net/virtio_net.c:1428 napi_poll net/core/dev.c:6352 [inline] net_rx_action+0x3ae/0xa50 net/core/dev.c:6418 read to 0xffff88812665f32c of 4 bytes by task 7292 on cpu 0: tcp_space include/net/tcp.h:1373 [inline] tcp_grow_window.isra.0+0x6b/0x480 net/ipv4/tcp_input.c:413 tcp_event_data_recv+0x68f/0x990 net/ipv4/tcp_input.c:717 tcp_rcv_established+0xbfe/0xf50 net/ipv4/tcp_input.c:5618 tcp_v4_do_rcv+0x381/0x4e0 net/ipv4/tcp_ipv4.c:1542 sk_backlog_rcv include/net/sock.h:945 [inline] __release_sock+0x135/0x1e0 net/core/sock.c:2427 release_sock+0x61/0x160 net/core/sock.c:2943 tcp_recvmsg+0x63b/0x1a30 net/ipv4/tcp.c:2181 inet_recvmsg+0xbb/0x250 net/ipv4/af_inet.c:838 sock_recvmsg_nosec net/socket.c:871 [inline] sock_recvmsg net/socket.c:889 [inline] sock_recvmsg+0x92/0xb0 net/socket.c:885 sock_read_iter+0x15f/0x1e0 net/socket.c:967 call_read_iter include/linux/fs.h:1864 [inline] new_sync_read+0x389/0x4f0 fs/read_write.c:414 __vfs_read+0xb1/0xc0 fs/read_write.c:427 vfs_read fs/read_write.c:461 [inline] vfs_read+0x143/0x2c0 fs/read_write.c:446 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 7292 Comm: syz-fuzzer Not tainted 5.3.0+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
| | * net: silence KCSAN warnings around sk_add_backlog() callsEric Dumazet2019-10-091-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sk_add_backlog() callers usually read sk->sk_rcvbuf without owning the socket lock. This means sk_rcvbuf value can be changed by other cpus, and KCSAN complains. Add READ_ONCE() annotations to document the lockless nature of these reads. Note that writes over sk_rcvbuf should also use WRITE_ONCE(), but this will be done in separate patches to ease stable backports (if we decide this is relevant for stable trees). BUG: KCSAN: data-race in tcp_add_backlog / tcp_recvmsg write to 0xffff88812ab369f8 of 8 bytes by interrupt on cpu 1: __sk_add_backlog include/net/sock.h:902 [inline] sk_add_backlog include/net/sock.h:933 [inline] tcp_add_backlog+0x45a/0xcc0 net/ipv4/tcp_ipv4.c:1737 tcp_v4_rcv+0x1aba/0x1bf0 net/ipv4/tcp_ipv4.c:1925 ip_protocol_deliver_rcu+0x51/0x470 net/ipv4/ip_input.c:204 ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231 NF_HOOK include/linux/netfilter.h:305 [inline] NF_HOOK include/linux/netfilter.h:299 [inline] ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252 dst_input include/net/dst.h:442 [inline] ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413 NF_HOOK include/linux/netfilter.h:305 [inline] NF_HOOK include/linux/netfilter.h:299 [inline] ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523 __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5004 __netif_receive_skb+0x37/0xf0 net/core/dev.c:5118 netif_receive_skb_internal+0x59/0x190 net/core/dev.c:5208 napi_skb_finish net/core/dev.c:5671 [inline] napi_gro_receive+0x28f/0x330 net/core/dev.c:5704 receive_buf+0x284/0x30b0 drivers/net/virtio_net.c:1061 virtnet_receive drivers/net/virtio_net.c:1323 [inline] virtnet_poll+0x436/0x7d0 drivers/net/virtio_net.c:1428 napi_poll net/core/dev.c:6352 [inline] net_rx_action+0x3ae/0xa50 net/core/dev.c:6418 read to 0xffff88812ab369f8 of 8 bytes by task 7271 on cpu 0: tcp_recvmsg+0x470/0x1a30 net/ipv4/tcp.c:2047 inet_recvmsg+0xbb/0x250 net/ipv4/af_inet.c:838 sock_recvmsg_nosec net/socket.c:871 [inline] sock_recvmsg net/socket.c:889 [inline] sock_recvmsg+0x92/0xb0 net/socket.c:885 sock_read_iter+0x15f/0x1e0 net/socket.c:967 call_read_iter include/linux/fs.h:1864 [inline] new_sync_read+0x389/0x4f0 fs/read_write.c:414 __vfs_read+0xb1/0xc0 fs/read_write.c:427 vfs_read fs/read_write.c:461 [inline] vfs_read+0x143/0x2c0 fs/read_write.c:446 ksys_read+0xd5/0x1b0 fs/read_write.c:587 __do_sys_read fs/read_write.c:597 [inline] __se_sys_read fs/read_write.c:595 [inline] __x64_sys_read+0x4c/0x60 fs/read_write.c:595 do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 7271 Comm: syz-fuzzer Not tainted 5.3.0+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
| | * sctp: add chunks to sk_backlog when the newsk sk_socket is not setXin Long2019-10-091-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is to fix a NULL-ptr deref in selinux_socket_connect_helper: [...] kasan: GPF could be caused by NULL-ptr deref or user memory access [...] RIP: 0010:selinux_socket_connect_helper+0x94/0x460 [...] Call Trace: [...] selinux_sctp_bind_connect+0x16a/0x1d0 [...] security_sctp_bind_connect+0x58/0x90 [...] sctp_process_asconf+0xa52/0xfd0 [sctp] [...] sctp_sf_do_asconf+0x785/0x980 [sctp] [...] sctp_do_sm+0x175/0x5a0 [sctp] [...] sctp_assoc_bh_rcv+0x285/0x5b0 [sctp] [...] sctp_backlog_rcv+0x482/0x910 [sctp] [...] __release_sock+0x11e/0x310 [...] release_sock+0x4f/0x180 [...] sctp_accept+0x3f9/0x5a0 [sctp] [...] inet_accept+0xe7/0x720 It was caused by that the 'newsk' sk_socket was not set before going to security sctp hook when processing asconf chunk with SCTP_PARAM_ADD_IP or SCTP_PARAM_SET_PRIMARY: inet_accept()-> sctp_accept(): lock_sock(): lock listening 'sk' do_softirq(): sctp_rcv(): <-- [1] asconf chunk arrives and enqueued in 'sk' backlog sctp_sock_migrate(): set asoc's sk to 'newsk' release_sock(): sctp_backlog_rcv(): lock 'newsk' sctp_process_asconf() <-- [2] unlock 'newsk' sock_graft(): set sk_socket <-- [3] As it shows, at [1] the asconf chunk would be put into the listening 'sk' backlog, as accept() was holding its sock lock. Then at [2] asconf would get processed with 'newsk' as asoc's sk had been set to 'newsk'. However, 'newsk' sk_socket is not set until [3], while selinux_sctp_bind_connect() would deref it, then kernel crashed. Here to fix it by adding the chunk to sk_backlog until newsk sk_socket is set when .accept() is done. Note that sk->sk_socket can be NULL when the sock is closed, so SOCK_DEAD flag is also needed to check in sctp_newsk_ready(). Thanks to Ondrej for reviewing the code. Fixes: d452930fd3b9 ("selinux: Add SCTP support") Reported-by: Ying Xu <yinxu@redhat.com> Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
| * | net: sctp: Rename fallthrough label to unhandledJoe Perches2019-10-111-6/+6
| |/ | | | | | | | | | | | | | | | | | | | | fallthrough will become a pseudo reserved keyword so this only use of fallthrough is better renamed to allow it. Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | sctp: add SCTP_SEND_FAILED_EVENT eventXin Long2019-10-092-21/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is to add a new event SCTP_SEND_FAILED_EVENT described in rfc6458#section-6.1.11. It's a update of SCTP_SEND_FAILED event: struct sctp_sndrcvinfo ssf_info is replaced with struct sctp_sndinfo ssfe_info in struct sctp_send_failed_event. SCTP_SEND_FAILED is being deprecated, but we don't remove it in this patch. Both are being processed in sctp_datamsg_destroy() when the corresp event flag is set. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
* | sctp: add SCTP_ADDR_MADE_PRIM eventXin Long2019-10-091-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sctp_ulpevent_nofity_peer_addr_change() would be called in sctp_assoc_set_primary() to send SCTP_ADDR_MADE_PRIM event when this transport is set to the primary path of the asoc. This event is described in rfc6458#section-6.1.2: SCTP_ADDR_MADE_PRIM: This address has now been made the primary destination address. This notification is provided whenever an address is made primary. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
* | sctp: add SCTP_ADDR_REMOVED eventXin Long2019-10-091-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sctp_ulpevent_nofity_peer_addr_change() is called in sctp_assoc_rm_peer() to send SCTP_ADDR_REMOVED event when this transport is removed from the asoc. This event is described in rfc6458#section-6.1.2: SCTP_ADDR_REMOVED: The address is no longer part of the association. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
* | sctp: add SCTP_ADDR_ADDED eventXin Long2019-10-092-14/+23
|/ | | | | | | | | | | | | | A helper sctp_ulpevent_nofity_peer_addr_change() will be extracted to make peer_addr_change event and enqueue it, and the helper will be called in sctp_assoc_add_peer() to send SCTP_ADDR_ADDED event. This event is described in rfc6458#section-6.1.2: SCTP_ADDR_ADDED: The address is now part of the association. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
* netfilter: drop bridge nf reset from nf_resetFlorian Westphal2019-10-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | commit 174e23810cd31 ("sk_buff: drop all skb extensions on free and skb scrubbing") made napi recycle always drop skb extensions. The additional skb_ext_del() that is performed via nf_reset on napi skb recycle is not needed anymore. Most nf_reset() calls in the stack are there so queued skb won't block 'rmmod nf_conntrack' indefinitely. This removes the skb_ext_del from nf_reset, and renames it to a more fitting nf_reset_ct(). In a few selected places, add a call to skb_ext_reset to make sure that no active extensions remain. I am submitting this for "net", because we're still early in the release cycle. The patch applies to net-next too, but I think the rename causes needless divergence between those trees. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* ipv6: add priority parameter to ip6_xmit()Eric Dumazet2019-09-271-1/+1
| | | | | | | | | | | | | Currently, ip6_xmit() sets skb->priority based on sk->sk_priority This is not desirable for TCP since TCP shares the same ctl socket for a given netns. We want to be able to send RST or ACK packets with a non zero skb->priority. This patch has no functional change. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller2019-09-153-13/+15
|\ | | | | | | | | | | Minor overlapping changes in the btusb and ixgbe drivers. Signed-off-by: David S. Miller <davem@davemloft.net>
| * sctp: destroy bucket if failed to bind addrMao Wenan2019-09-131-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is one memory leak bug report: BUG: memory leak unreferenced object 0xffff8881dc4c5ec0 (size 40): comm "syz-executor.0", pid 5673, jiffies 4298198457 (age 27.578s) hex dump (first 32 bytes): 02 00 00 00 81 88 ff ff 00 00 00 00 00 00 00 00 ................ f8 63 3d c1 81 88 ff ff 00 00 00 00 00 00 00 00 .c=............. backtrace: [<0000000072006339>] sctp_get_port_local+0x2a1/0xa00 [sctp] [<00000000c7b379ec>] sctp_do_bind+0x176/0x2c0 [sctp] [<000000005be274a2>] sctp_bind+0x5a/0x80 [sctp] [<00000000b66b4044>] inet6_bind+0x59/0xd0 [ipv6] [<00000000c68c7f42>] __sys_bind+0x120/0x1f0 net/socket.c:1647 [<000000004513635b>] __do_sys_bind net/socket.c:1658 [inline] [<000000004513635b>] __se_sys_bind net/socket.c:1656 [inline] [<000000004513635b>] __x64_sys_bind+0x3e/0x50 net/socket.c:1656 [<0000000061f2501e>] do_syscall_64+0x72/0x2e0 arch/x86/entry/common.c:296 [<0000000003d1e05e>] entry_SYSCALL_64_after_hwframe+0x49/0xbe This is because in sctp_do_bind, if sctp_get_port_local is to create hash bucket successfully, and sctp_add_bind_addr failed to bind address, e.g return -ENOMEM, so memory leak found, it needs to destroy allocated bucket. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Mao Wenan <maowenan@huawei.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * sctp: remove redundant assignment when call sctp_get_port_localMao Wenan2019-09-131-2/+1
| | | | | | | | | | | | | | | | | | | | | | There are more parentheses in if clause when call sctp_get_port_local in sctp_do_bind, and redundant assignment to 'ret'. This patch is to do cleanup. Signed-off-by: Mao Wenan <maowenan@huawei.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * sctp: change return type of sctp_get_port_localMao Wenan2019-09-131-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently sctp_get_port_local() returns a long which is either 0,1 or a pointer casted to long. It's neither of the callers use the return value since commit 62208f12451f ("net: sctp: simplify sctp_get_port"). Now two callers are sctp_get_port and sctp_do_bind, they actually assumend a casted to an int was the same as a pointer casted to a long, and they don't save the return value just check whether it is zero or non-zero, so it would better change return type from long to int for sctp_get_port_local. Signed-off-by: Mao Wenan <maowenan@huawei.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * sctp: Fix the link time qualifier of 'sctp_ctrlsock_exit()'Christophe JAILLET2019-09-121-1/+1
| | | | | | | | | | | | | | | | | | | | The '.exit' functions from 'pernet_operations' structure should be marked as __net_exit, not __net_init. Fixes: 8e2d61e0aed2 ("sctp: fix race on protocol/netns initialization") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * sctp: fix the missing put_user when dumping transport thresholdsXin Long2019-09-101-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | This issue causes SCTP_PEER_ADDR_THLDS sockopt not to be able to dump a transport thresholds info. Fix it by adding 'goto' put_user in sctp_getsockopt_paddr_thresholds. Fixes: 8add543e369d ("sctp: add SCTP_FUTURE_ASSOC for SCTP_PEER_ADDR_THLDS sockopt") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * sctp: use transport pf_retrans in sctp_do_8_2_transport_strikeXin Long2019-09-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Transport should use its own pf_retrans to do the error_count check, instead of asoc's. Otherwise, it's meaningless to make pf_retrans per transport. Fixes: 5aa93bcf66f4 ("sctp: Implement quick failover draft from tsvwg") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | sctp: allow users to set ep ecn flag by sockoptXin Long2019-08-271-0/+73
| | | | | | | | | | | | | | | | SCTP_ECN_SUPPORTED sockopt will be added to allow users to change ep ecn flag, and it's similar with other feature flags. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | sctp: allow users to set netns ecn flag with sysctlXin Long2019-08-271-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | sysctl net.sctp.ecn_enable is added in this patch. It will allow users to change the default sctp ecn flag, net.sctp.ecn_enable. This feature was also required on this thread: http://lkml.iu.edu/hypermail/linux/kernel/0812.1/01858.html Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | sctp: make ecn flag per netns and endpointXin Long2019-08-273-4/+16
| | | | | | | | | | | | | | | | | | | | | | | | This patch is to add ecn flag for both netns_sctp and sctp_endpoint, net->sctp.ecn_enable is set 1 by default, and ep->ecn_enable will be initialized with net->sctp.ecn_enable. asoc->peer.ecn_capable will be set during negotiation only when ep->ecn_enable is set on both sides. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | sctp: remove net sctp.x_enable working as a global switchXin Long2019-08-191-16/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The netns sctp feature flags shouldn't work as a global switch, which is mostly like a firewall/netfilter's job. Also, it will break asoc as it discard or accept chunks incorrectly when net sctp.x_enable is changed after the asoc is created. Since each type of chunk's processing function will check the corresp asoc's feature flag, this 'global switch' should be removed, and net sctp.x_enable will only work as the default feature flags for the future sctp sockets/endpoints. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | sctp: add SCTP_AUTH_SUPPORTED sockoptXin Long2019-08-191-0/+86
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SCTP_AUTH_SUPPORTED sockopt is used to set enpoint's auth flag. With this feature, each endpoint will have its own flag for its future asoc's auth_capable, instead of netns auth flag. Note that when both ep's auth_enable is enabled, endpoint auth related data should be initialized. If asconf_enable is also set, SCTP_CID_ASCONF/SCTP_CID_ASCONF_ACK should be added into auth_chunk_list. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | sctp: add sctp_auth_init and sctp_auth_freeXin Long2019-08-192-56/+74
| | | | | | | | | | | | | | | | | | This patch is to factor out sctp_auth_init and sctp_auth_free functions, and sctp_auth_init will also be used in the next patch for SCTP_AUTH_SUPPORTED sockopt. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | sctp: use ep and asoc auth_enable properlyXin Long2019-08-192-33/+44
| | | | | | | | | | | | | | | | | | sctp has per endpoint auth flag and per asoc auth flag, and the asoc one should be checked when coming to asoc and the endpoint one should be checked when coming to endpoint. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | sctp: add SCTP_ASCONF_SUPPORTED sockoptXin Long2019-08-191-0/+82
| | | | | | | | | | | | | | | | | | | | | | | | | | | | SCTP_ASCONF_SUPPORTED sockopt is used to set enpoint's asconf flag. With this feature, each endpoint will have its own flag for its future asoc's asconf_capable, instead of netns asconf flag. Note that when both ep's asconf_enable and auth_enable are enabled, SCTP_CID_ASCONF and SCTP_CID_ASCONF_ACK should be added into auth_chunk_list. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | sctp: check asoc peer.asconf_capable before processing asconfXin Long2019-08-191-2/+4
| | | | | | | | | | | | | | | | asconf chunks should be dropped when the asoc doesn't support asconf feature. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | sctp: not set peer.asconf_capable in sctp_association_initXin Long2019-08-191-9/+0
| | | | | | | | | | | | | | | | | | | | asoc->peer.asconf_capable is to be set during handshake, and its value should be initialized to 0. net->sctp.addip_noauth will be checked in sctp_process_init when processing INIT_ACK on client and COOKIE_ECHO on server. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | sctp: add asconf_enable in struct sctp_endpointXin Long2019-08-193-20/+18
| | | | | | | | | | | | | | | | | | | | | | | | This patch is to make addip/asconf flag per endpoint, and its value is initialized by the per netns flag, net->sctp.addip_enable. It also replaces the checks of net->sctp.addip_enable with ep->asconf_enable in some places. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller2019-08-192-1/+2
|\| | | | | | | | | | | | | Merge conflict of mlx5 resolved using instructions in merge commit 9566e650bf7fdf58384bb06df634f7531ca3a97e. Signed-off-by: David S. Miller <davem@davemloft.net>
| * sctp: fix memleak in sctp_send_reset_streamszhengbin2019-08-131-0/+1
| | | | | | | | | | | | | | | | | | | | | | If the stream outq is not empty, need to kfree nstr_list. Fixes: d570a59c5b5f ("sctp: only allow the out stream reset when the stream outq is empty") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
| * sctp: fix the transport error_count checkXin Long2019-08-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As the annotation says in sctp_do_8_2_transport_strike(): "If the transport error count is greater than the pf_retrans threshold, and less than pathmaxrtx ..." It should be transport->error_count checked with pathmaxrxt, instead of asoc->pf_retrans. Fixes: 5aa93bcf66f4 ("sctp: Implement quick failover draft from tsvwg") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller2019-08-061-2/+2
|\| | | | | | | | | | | Just minor overlapping changes in the conflicts here. Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: sctp: drop unneeded likely() call around IS_ERR()Enrico Weigelt2019-07-291-2/+2
| | | | | | | | | | | | | | | | | | IS_ERR() already calls unlikely(), so this extra unlikely() call around IS_ERR() is not needed. Signed-off-by: Enrico Weigelt <info@metux.net> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | sctp: factor out sctp_connect_add_peerXin Long2019-07-301-45/+31
| | | | | | | | | | | | | | | | | | In this function factored out from sctp_sendmsg_new_asoc() and __sctp_connect(), it adds a peer with the other addr into the asoc after this asoc is created with the 1st addr. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | sctp: factor out sctp_connect_new_asocXin Long2019-07-301-84/+76
| | | | | | | | | | | | | | | | | | In this function factored out from sctp_sendmsg_new_asoc() and __sctp_connect(), it creates the asoc and adds a peer with the 1st addr. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | sctp: clean up __sctp_connectXin Long2019-07-301-136/+73
| | | | | | | | | | | | | | | | | | | | | | | | | | | | __sctp_connect is doing quit similar things as sctp_sendmsg_new_asoc. To factor out common functions, this patch is to clean up their code to make them look more similar: 1. create the asoc and add a peer with the 1st addr. 2. add peers with the other addrs into this asoc one by one. while at it, also remove the unused 'addrcnt'. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | sctp: check addr_size with sa_family_t size in __sctp_setsockopt_connectxXin Long2019-07-301-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now __sctp_connect() is called by __sctp_setsockopt_connectx() and sctp_inet_connect(), the latter has done addr_size check with size of sa_family_t. In the next patch to clean up __sctp_connect(), we will remove addr_size check with size of sa_family_t from __sctp_connect() for the 1st address. So before doing that, __sctp_setsockopt_connectx() should do this check first, as sctp_inet_connect() does. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | sctp: only copy the available addr data in sctp_transport_initXin Long2019-07-301-1/+1
|/ | | | | | | | | | | | | | | | | | | | 'addr' passed to sctp_transport_init is not always a whole size of union sctp_addr, like the path: sctp_sendmsg() -> sctp_sendmsg_new_asoc() -> sctp_assoc_add_peer() -> sctp_transport_new() -> sctp_transport_init() In the next patches, we will also pass the address length of data only to sctp_assoc_add_peer(). So sctp_transport_init() should copy the only available data from addr to peer->ipaddr, instead of 'peer->ipaddr = *addr' which may cause slab-out-of-bounds. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>