summaryrefslogtreecommitdiffstats
path: root/net
Commit message (Collapse)AuthorAgeFilesLines
* Bluetooth: Fix UAF in hci_conn_hash_flush againRuihan Li2023-06-141-11/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit a2ac591cb4d83e1f2d4b4adb3c14b2c79764650a upstream. Commit 06149746e720 ("Bluetooth: hci_conn: Add support for linking multiple hcon") reintroduced a previously fixed bug [1] ("KASAN: slab-use-after-free Read in hci_conn_hash_flush"). This bug was originally fixed by commit 5dc7d23e167e ("Bluetooth: hci_conn: Fix possible UAF"). The hci_conn_unlink function was added to avoid invalidating the link traversal caused by successive hci_conn_del operations releasing extra connections. However, currently hci_conn_unlink itself also releases extra connections, resulted in the reintroduced bug. This patch follows a more robust solution for cleaning up all connections, by repeatedly removing the first connection until there are none left. This approach does not rely on the inner workings of hci_conn_del and ensures proper cleanup of all connections. Meanwhile, we need to make sure that hci_conn_del never fails. Indeed it doesn't, as it now always returns zero. To make this a bit clearer, this patch also changes its return type to void. Reported-by: syzbot+8bb72f86fc823817bc5d@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-bluetooth/000000000000aa920505f60d25ad@google.com/ Fixes: 06149746e720 ("Bluetooth: hci_conn: Add support for linking multiple hcon") Signed-off-by: Ruihan Li <lrh2000@pku.edu.cn> Co-developed-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* Bluetooth: Refcnt drop must be placed last in hci_conn_unlinkRuihan Li2023-06-141-3/+3
| | | | | | | | | | | | | | | | | | | | commit 2910431ab0e500dfc5df12299bb15eef0f30b43e upstream. If hci_conn_put(conn->parent) reduces conn->parent's reference count to zero, it can immediately deallocate conn->parent. At the same time, conn->link->list has its head in conn->parent, causing use-after-free problems in the latter list_del_rcu(&conn->link->list). This problem can be easily solved by reordering the two operations, i.e., first performing the list removal with list_del_rcu and then decreasing the refcnt with hci_conn_put. Reported-by: Luiz Augusto von Dentz <luiz.dentz@gmail.com> Closes: https://lore.kernel.org/linux-bluetooth/CABBYNZ+1kce8_RJrLNOXd_8=Mdpb=2bx4Nto-hFORk=qiOkoCg@mail.gmail.com/ Fixes: 06149746e720 ("Bluetooth: hci_conn: Add support for linking multiple hcon") Signed-off-by: Ruihan Li <lrh2000@pku.edu.cn> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* Bluetooth: Fix potential double free caused by hci_conn_unlinkRuihan Li2023-06-141-9/+12
| | | | | | | | | | | | | | | | | | | | | | | commit ca1fd42e7dbfcb34890ffbf1f2f4b356776dab6f upstream. The hci_conn_unlink function is being called by hci_conn_del, which means it should not call hci_conn_del with the input parameter conn again. If it does, conn may have already been released when hci_conn_unlink returns, leading to potential UAF and double-free issues. This patch resolves the problem by modifying hci_conn_unlink to release only conn's child links when necessary, but never release conn itself. Reported-by: syzbot+690b90b14f14f43f4688@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-bluetooth/000000000000484a8205faafe216@google.com/ Fixes: 06149746e720 ("Bluetooth: hci_conn: Add support for linking multiple hcon") Signed-off-by: Ruihan Li <lrh2000@pku.edu.cn> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Reported-by: syzbot+690b90b14f14f43f4688@syzkaller.appspotmail.com Reported-by: Luiz Augusto von Dentz <luiz.dentz@gmail.com> Reported-by: syzbot+8bb72f86fc823817bc5d@syzkaller.appspotmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* Bluetooth: fix debugfs registrationJohan Hovold2023-06-141-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | commit fe2ccc6c29d53e14d3c8b3ddf8ad965a92e074ee upstream. Since commit ec6cef9cd98d ("Bluetooth: Fix SMP channel registration for unconfigured controllers") the debugfs interface for unconfigured controllers will be created when the controller is configured. There is however currently nothing preventing a controller from being configured multiple time (e.g. setting the device address using btmgmt) which results in failed attempts to register the already registered debugfs entries: debugfs: File 'features' in directory 'hci0' already present! debugfs: File 'manufacturer' in directory 'hci0' already present! debugfs: File 'hci_version' in directory 'hci0' already present! ... debugfs: File 'quirk_simultaneous_discovery' in directory 'hci0' already present! Add a controller flag to avoid trying to register the debugfs interface more than once. Fixes: ec6cef9cd98d ("Bluetooth: Fix SMP channel registration for unconfigured controllers") Cc: stable@vger.kernel.org # 4.0 Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* Bluetooth: Fix use-after-free in hci_remove_ltk/hci_remove_irkLuiz Augusto von Dentz2023-06-141-4/+4
| | | | | | | | | | | | | commit c5d2b6fa26b5b8386a9cc902cdece3a46bef2bd2 upstream. Similar to commit 0f7d9b31ce7a ("netfilter: nf_tables: fix use-after-free in nft_set_catchall_destroy()"). We can not access k after kfree_rcu() call. Cc: stable@vger.kernel.org Signed-off-by: Min Li <lm0963hack@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* mptcp: update userspace pm infosGeliang Tang2023-06-142-4/+24
| | | | | | | | | | | | | | | | | | | | | | | | commit 77e4b94a3de692a09b79945ecac5b8e6b77f10c1 upstream. Increase pm subflows counter on both server side and client side when userspace pm creates a new subflow, and decrease the counter when it closes a subflow. Increase add_addr_signaled counter in mptcp_nl_cmd_announce() when the address is announced by userspace PM. This modification is similar to how the in-kernel PM is updating the counter: when additional subflows are created/removed. Fixes: 9ab4807c84a4 ("mptcp: netlink: Add MPTCP_PM_CMD_ANNOUNCE") Fixes: 702c2f646d42 ("mptcp: netlink: allow userspace-driven subflow establishment") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/329 Cc: stable@vger.kernel.org Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* mptcp: add address into userspace pm listGeliang Tang2023-06-141-0/+41
| | | | | | | | | | | | | | | | | | | | | commit 24430f8bf51655c5ab7ddc2fafe939dd3cd0dd47 upstream. Add the address into userspace_pm_local_addr_list when the subflow is created. Make sure it can be found in mptcp_nl_cmd_remove(). And delete it in the new helper mptcp_userspace_pm_delete_local_addr(). By doing this, the "REMOVE" command also works with subflows that have been created via the "SUB_CREATE" command instead of restricting to the addresses that have been announced via the "ANNOUNCE" command. Fixes: d9a4594edabf ("mptcp: netlink: Add MPTCP_PM_CMD_REMOVE") Link: https://github.com/multipath-tcp/mptcp_net-next/issues/379 Cc: stable@vger.kernel.org Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* mptcp: only send RM_ADDR in nl_cmd_removeGeliang Tang2023-06-143-1/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | commit 8b1c94da1e481090f24127b2c420b0c0b0421ce3 upstream. The specifications from [1] about the "REMOVE" command say: Announce that an address has been lost to the peer It was then only supposed to send a RM_ADDR and not trying to delete associated subflows. A new helper mptcp_pm_remove_addrs() is then introduced to do just that, compared to mptcp_pm_remove_addrs_and_subflows() also removing subflows. To delete a subflow, the userspace daemon can use the "SUB_DESTROY" command, see mptcp_nl_cmd_sf_destroy(). Fixes: d9a4594edabf ("mptcp: netlink: Add MPTCP_PM_CMD_REMOVE") Link: https://github.com/multipath-tcp/mptcp/blob/mptcp_v0.96/include/uapi/linux/mptcp.h [1] Cc: stable@vger.kernel.org Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* can: j1939: avoid possible use-after-free when j1939_can_rx_register failsFedor Pchelkin2023-06-141-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 9f16eb106aa5fce15904625661312623ec783ed3 upstream. Syzkaller reports the following failure: BUG: KASAN: use-after-free in kref_put include/linux/kref.h:64 [inline] BUG: KASAN: use-after-free in j1939_priv_put+0x25/0xa0 net/can/j1939/main.c:172 Write of size 4 at addr ffff888141c15058 by task swapper/3/0 CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.10.144-syzkaller #0 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x107/0x167 lib/dump_stack.c:118 print_address_description.constprop.0+0x1c/0x220 mm/kasan/report.c:385 __kasan_report mm/kasan/report.c:545 [inline] kasan_report.cold+0x1f/0x37 mm/kasan/report.c:562 check_memory_region_inline mm/kasan/generic.c:186 [inline] check_memory_region+0x145/0x190 mm/kasan/generic.c:192 instrument_atomic_read_write include/linux/instrumented.h:101 [inline] atomic_fetch_sub_release include/asm-generic/atomic-instrumented.h:220 [inline] __refcount_sub_and_test include/linux/refcount.h:272 [inline] __refcount_dec_and_test include/linux/refcount.h:315 [inline] refcount_dec_and_test include/linux/refcount.h:333 [inline] kref_put include/linux/kref.h:64 [inline] j1939_priv_put+0x25/0xa0 net/can/j1939/main.c:172 j1939_sk_sock_destruct+0x44/0x90 net/can/j1939/socket.c:374 __sk_destruct+0x4e/0x820 net/core/sock.c:1784 rcu_do_batch kernel/rcu/tree.c:2485 [inline] rcu_core+0xb35/0x1a30 kernel/rcu/tree.c:2726 __do_softirq+0x289/0x9a3 kernel/softirq.c:298 asm_call_irq_on_stack+0x12/0x20 </IRQ> __run_on_irqstack arch/x86/include/asm/irq_stack.h:26 [inline] run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:77 [inline] do_softirq_own_stack+0xaa/0xe0 arch/x86/kernel/irq_64.c:77 invoke_softirq kernel/softirq.c:393 [inline] __irq_exit_rcu kernel/softirq.c:423 [inline] irq_exit_rcu+0x136/0x200 kernel/softirq.c:435 sysvec_apic_timer_interrupt+0x4d/0x100 arch/x86/kernel/apic/apic.c:1095 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:635 Allocated by task 1141: kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48 kasan_set_track mm/kasan/common.c:56 [inline] __kasan_kmalloc.constprop.0+0xc9/0xd0 mm/kasan/common.c:461 kmalloc include/linux/slab.h:552 [inline] kzalloc include/linux/slab.h:664 [inline] j1939_priv_create net/can/j1939/main.c:131 [inline] j1939_netdev_start+0x111/0x860 net/can/j1939/main.c:268 j1939_sk_bind+0x8ea/0xd30 net/can/j1939/socket.c:485 __sys_bind+0x1f2/0x260 net/socket.c:1645 __do_sys_bind net/socket.c:1656 [inline] __se_sys_bind net/socket.c:1654 [inline] __x64_sys_bind+0x6f/0xb0 net/socket.c:1654 do_syscall_64+0x33/0x40 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x61/0xc6 Freed by task 1141: kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48 kasan_set_track+0x1c/0x30 mm/kasan/common.c:56 kasan_set_free_info+0x1b/0x30 mm/kasan/generic.c:355 __kasan_slab_free+0x112/0x170 mm/kasan/common.c:422 slab_free_hook mm/slub.c:1542 [inline] slab_free_freelist_hook+0xad/0x190 mm/slub.c:1576 slab_free mm/slub.c:3149 [inline] kfree+0xd9/0x3b0 mm/slub.c:4125 j1939_netdev_start+0x5ee/0x860 net/can/j1939/main.c:300 j1939_sk_bind+0x8ea/0xd30 net/can/j1939/socket.c:485 __sys_bind+0x1f2/0x260 net/socket.c:1645 __do_sys_bind net/socket.c:1656 [inline] __se_sys_bind net/socket.c:1654 [inline] __x64_sys_bind+0x6f/0xb0 net/socket.c:1654 do_syscall_64+0x33/0x40 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x61/0xc6 It can be caused by this scenario: CPU0 CPU1 j1939_sk_bind(socket0, ndev0, ...) j1939_netdev_start() j1939_sk_bind(socket1, ndev0, ...) j1939_netdev_start() mutex_lock(&j1939_netdev_lock) j1939_priv_set(ndev0, priv) mutex_unlock(&j1939_netdev_lock) if (priv_new) kref_get(&priv_new->rx_kref) return priv_new; /* inside j1939_sk_bind() */ jsk->priv = priv j1939_can_rx_register(priv) // fails j1939_priv_set(ndev, NULL) kfree(priv) j1939_sk_sock_destruct() j1939_priv_put() // <- uaf To avoid this, call j1939_can_rx_register() under j1939_netdev_lock so that a concurrent thread cannot process j1939_priv before j1939_can_rx_register() returns. Found by Linux Verification Center (linuxtesting.org) with Syzkaller. Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol") Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://lore.kernel.org/r/20230526171910.227615-3-pchelkin@ispras.ru Cc: stable@vger.kernel.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* can: j1939: change j1939_netdev_lock type to mutexFedor Pchelkin2023-06-141-11/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | commit cd9c790de2088b0d797dc4d244b4f174f9962554 upstream. It turns out access to j1939_can_rx_register() needs to be serialized, otherwise j1939_priv can be corrupted when parallel threads call j1939_netdev_start() and j1939_can_rx_register() fails. This issue is thoroughly covered in other commit which serializes access to j1939_can_rx_register(). Change j1939_netdev_lock type to mutex so that we do not need to remove GFP_KERNEL from can_rx_register(). j1939_netdev_lock seems to be used in normal contexts where mutex usage is not prohibited. Found by Linux Verification Center (linuxtesting.org) with Syzkaller. Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol") Suggested-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://lore.kernel.org/r/20230526171910.227615-2-pchelkin@ispras.ru Cc: stable@vger.kernel.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* can: j1939: j1939_sk_send_loop_abort(): improved error queue handling in ↵Oleksij Rempel2023-06-141-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | J1939 Socket commit 2a84aea80e925ecba6349090559754f8e8eb68ef upstream. This patch addresses an issue within the j1939_sk_send_loop_abort() function in the j1939/socket.c file, specifically in the context of Transport Protocol (TP) sessions. Without this patch, when a TP session is initiated and a Clear To Send (CTS) frame is received from the remote side requesting one data packet, the kernel dispatches the first Data Transport (DT) frame and then waits for the next CTS. If the remote side doesn't respond with another CTS, the kernel aborts due to a timeout. This leads to the user-space receiving an EPOLLERR on the socket, and the socket becomes active. However, when trying to read the error queue from the socket with sock.recvmsg(, , socket.MSG_ERRQUEUE), it returns -EAGAIN, given that the socket is non-blocking. This situation results in an infinite loop: the user-space repeatedly calls epoll(), epoll() returns the socket file descriptor with EPOLLERR, but the socket then blocks on the recv() of ERRQUEUE. This patch introduces an additional check for the J1939_SOCK_ERRQUEUE flag within the j1939_sk_send_loop_abort() function. If the flag is set, it indicates that the application has subscribed to receive error queue messages. In such cases, the kernel can communicate the current transfer state via the error queue. This allows for the function to return early, preventing the unnecessary setting of the socket into an error state, and breaking the infinite loop. It is crucial to note that a socket error is only needed if the application isn't using the error queue, as, without it, the application wouldn't be aware of transfer issues. Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol") Reported-by: David Jander <david@protonic.nl> Tested-by: David Jander <david@protonic.nl> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://lore.kernel.org/r/20230526081946.715190-1-o.rempel@pengutronix.de Cc: stable@vger.kernel.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* batman-adv: Broken sync while rescheduling delayed workVladislav Efanov2023-06-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit abac3ac97fe8734b620e7322a116450d7f90aa43 upstream. Syzkaller got a lot of crashes like: KASAN: use-after-free Write in *_timers* All of these crashes point to the same memory area: The buggy address belongs to the object at ffff88801f870000 which belongs to the cache kmalloc-8k of size 8192 The buggy address is located 5320 bytes inside of 8192-byte region [ffff88801f870000, ffff88801f872000) This area belongs to : batadv_priv->batadv_priv_dat->delayed_work->timer_list The reason for these issues is the lack of synchronization. Delayed work (batadv_dat_purge) schedules new timer/work while the device is being deleted. As the result new timer/delayed work is set after cancel_delayed_work_sync() was called. So after the device is freed the timer list contains pointer to already freed memory. Found by Linux Verification Center (linuxtesting.org) with syzkaller. Cc: stable@kernel.org Fixes: 2f1dfbe18507 ("batman-adv: Distributed ARP Table - implement local storage") Signed-off-by: Vladislav Efanov <VEfanov@ispras.ru> Acked-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* net: sched: fix possible refcount leak in tc_chain_tmplt_add()Hangyu Hua2023-06-141-0/+1
| | | | | | | | | | | | | | [ Upstream commit 44f8baaf230c655c249467ca415b570deca8df77 ] try_module_get will be called in tcf_proto_lookup_ops. So module_put needs to be called to drop the refcount if ops don't implement the required function. Fixes: 9f407f1768d3 ("net: sched: introduce chain templates") Signed-off-by: Hangyu Hua <hbh25y@gmail.com> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
* net: sched: act_police: fix sparse errors in tcf_police_dump()Eric Dumazet2023-06-141-5/+5
| | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 682881ee45c81daa883dcd4fe613b0b0d988bb22 ] Fixes following sparse errors: net/sched/act_police.c:360:28: warning: dereference of noderef expression net/sched/act_police.c:362:45: warning: dereference of noderef expression net/sched/act_police.c:362:45: warning: dereference of noderef expression net/sched/act_police.c:368:28: warning: dereference of noderef expression net/sched/act_police.c:370:45: warning: dereference of noderef expression net/sched/act_police.c:370:45: warning: dereference of noderef expression net/sched/act_police.c:376:45: warning: dereference of noderef expression net/sched/act_police.c:376:45: warning: dereference of noderef expression Fixes: d1967e495a8d ("net_sched: act_police: add 2 new attributes to support police 64bit rate and peakrate") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
* net: openvswitch: fix upcall counter access before allocationEelco Chaudron2023-06-142-21/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit de9df6c6b27e22d7bdd20107947ef3a20e687de5 ] Currently, the per cpu upcall counters are allocated after the vport is created and inserted into the system. This could lead to the datapath accessing the counters before they are allocated resulting in a kernel Oops. Here is an example: PID: 59693 TASK: ffff0005f4f51500 CPU: 0 COMMAND: "ovs-vswitchd" #0 [ffff80000a39b5b0] __switch_to at ffffb70f0629f2f4 #1 [ffff80000a39b5d0] __schedule at ffffb70f0629f5cc #2 [ffff80000a39b650] preempt_schedule_common at ffffb70f0629fa60 #3 [ffff80000a39b670] dynamic_might_resched at ffffb70f0629fb58 #4 [ffff80000a39b680] mutex_lock_killable at ffffb70f062a1388 #5 [ffff80000a39b6a0] pcpu_alloc at ffffb70f0594460c #6 [ffff80000a39b750] __alloc_percpu_gfp at ffffb70f05944e68 #7 [ffff80000a39b760] ovs_vport_cmd_new at ffffb70ee6961b90 [openvswitch] ... PID: 58682 TASK: ffff0005b2f0bf00 CPU: 0 COMMAND: "kworker/0:3" #0 [ffff80000a5d2f40] machine_kexec at ffffb70f056a0758 #1 [ffff80000a5d2f70] __crash_kexec at ffffb70f057e2994 #2 [ffff80000a5d3100] crash_kexec at ffffb70f057e2ad8 #3 [ffff80000a5d3120] die at ffffb70f0628234c #4 [ffff80000a5d31e0] die_kernel_fault at ffffb70f062828a8 #5 [ffff80000a5d3210] __do_kernel_fault at ffffb70f056a31f4 #6 [ffff80000a5d3240] do_bad_area at ffffb70f056a32a4 #7 [ffff80000a5d3260] do_translation_fault at ffffb70f062a9710 #8 [ffff80000a5d3270] do_mem_abort at ffffb70f056a2f74 #9 [ffff80000a5d32a0] el1_abort at ffffb70f06297dac #10 [ffff80000a5d32d0] el1h_64_sync_handler at ffffb70f06299b24 #11 [ffff80000a5d3410] el1h_64_sync at ffffb70f056812dc #12 [ffff80000a5d3430] ovs_dp_upcall at ffffb70ee6963c84 [openvswitch] #13 [ffff80000a5d3470] ovs_dp_process_packet at ffffb70ee6963fdc [openvswitch] #14 [ffff80000a5d34f0] ovs_vport_receive at ffffb70ee6972c78 [openvswitch] #15 [ffff80000a5d36f0] netdev_port_receive at ffffb70ee6973948 [openvswitch] #16 [ffff80000a5d3720] netdev_frame_hook at ffffb70ee6973a28 [openvswitch] #17 [ffff80000a5d3730] __netif_receive_skb_core.constprop.0 at ffffb70f06079f90 We moved the per cpu upcall counter allocation to the existing vport alloc and free functions to solve this. Fixes: 95637d91fefd ("net: openvswitch: release vport resources on failure") Fixes: 1933ea365aa7 ("net: openvswitch: Add support to count upcall packets") Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Acked-by: Aaron Conole <aconole@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
* net: sched: move rtm_tca_policy declaration to include fileEric Dumazet2023-06-141-2/+0
| | | | | | | | | | | | | | | | [ Upstream commit 886bc7d6ed3357975c5f1d3c784da96000d4bbb4 ] rtm_tca_policy is used from net/sched/sch_api.c and net/sched/cls_api.c, thus should be declared in an include file. This fixes the following sparse warning: net/sched/sch_api.c:1434:25: warning: symbol 'rtm_tca_policy' was not declared. Should it be static? Fixes: e331473fee3d ("net/sched: cls_api: add missing validation of netlink attributes") Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
* net: sched: add rcu annotations around qdisc->qdisc_sleepingEric Dumazet2023-06-1411-41/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit d636fc5dd692c8f4e00ae6e0359c0eceeb5d9bdb ] syzbot reported a race around qdisc->qdisc_sleeping [1] It is time we add proper annotations to reads and writes to/from qdisc->qdisc_sleeping. [1] BUG: KCSAN: data-race in dev_graft_qdisc / qdisc_lookup_rcu read to 0xffff8881286fc618 of 8 bytes by task 6928 on cpu 1: qdisc_lookup_rcu+0x192/0x2c0 net/sched/sch_api.c:331 __tcf_qdisc_find+0x74/0x3c0 net/sched/cls_api.c:1174 tc_get_tfilter+0x18f/0x990 net/sched/cls_api.c:2547 rtnetlink_rcv_msg+0x7af/0x8c0 net/core/rtnetlink.c:6386 netlink_rcv_skb+0x126/0x220 net/netlink/af_netlink.c:2546 rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:6413 netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline] netlink_unicast+0x56f/0x640 net/netlink/af_netlink.c:1365 netlink_sendmsg+0x665/0x770 net/netlink/af_netlink.c:1913 sock_sendmsg_nosec net/socket.c:724 [inline] sock_sendmsg net/socket.c:747 [inline] ____sys_sendmsg+0x375/0x4c0 net/socket.c:2503 ___sys_sendmsg net/socket.c:2557 [inline] __sys_sendmsg+0x1e3/0x270 net/socket.c:2586 __do_sys_sendmsg net/socket.c:2595 [inline] __se_sys_sendmsg net/socket.c:2593 [inline] __x64_sys_sendmsg+0x46/0x50 net/socket.c:2593 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd write to 0xffff8881286fc618 of 8 bytes by task 6912 on cpu 0: dev_graft_qdisc+0x4f/0x80 net/sched/sch_generic.c:1115 qdisc_graft+0x7d0/0xb60 net/sched/sch_api.c:1103 tc_modify_qdisc+0x712/0xf10 net/sched/sch_api.c:1693 rtnetlink_rcv_msg+0x807/0x8c0 net/core/rtnetlink.c:6395 netlink_rcv_skb+0x126/0x220 net/netlink/af_netlink.c:2546 rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:6413 netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline] netlink_unicast+0x56f/0x640 net/netlink/af_netlink.c:1365 netlink_sendmsg+0x665/0x770 net/netlink/af_netlink.c:1913 sock_sendmsg_nosec net/socket.c:724 [inline] sock_sendmsg net/socket.c:747 [inline] ____sys_sendmsg+0x375/0x4c0 net/socket.c:2503 ___sys_sendmsg net/socket.c:2557 [inline] __sys_sendmsg+0x1e3/0x270 net/socket.c:2586 __do_sys_sendmsg net/socket.c:2595 [inline] __se_sys_sendmsg net/socket.c:2593 [inline] __x64_sys_sendmsg+0x46/0x50 net/socket.c:2593 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 6912 Comm: syz-executor.5 Not tainted 6.4.0-rc3-syzkaller-00190-g0d85b27b0cc6 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/16/2023 Fixes: 3a7d0d07a386 ("net: sched: extend Qdisc with rcu") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Vlad Buslov <vladbu@nvidia.com> Acked-by: Jamal Hadi Salim<jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
* rfs: annotate lockless accesses to RFS sock flow tableEric Dumazet2023-06-141-2/+4
| | | | | | | | | | | | | | | | | | | | [ Upstream commit 5c3b74a92aa285a3df722bf6329ba7ccf70346d6 ] Add READ_ONCE()/WRITE_ONCE() on accesses to the sock flow table. This also prevents a (smart ?) compiler to remove the condition in: if (table->ents[index] != newval) table->ents[index] = newval; We need the condition to avoid dirtying a shared cache line. Fixes: fec5e652e58f ("rfs: Receive Flow Steering") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
* tcp: gso: really support BIG TCPEric Dumazet2023-06-141-10/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 82a01ab35bd02ba4b0b4e12bc95c5b69240eb7b0 ] We missed that tcp_gso_segment() was assuming skb->len was smaller than 65535 : oldlen = (u16)~skb->len; This part came with commit 0718bcc09b35 ("[NET]: Fix CHECKSUM_HW GSO problems.") This leads to wrong TCP checksum. Adapt the code to accept arbitrary packet length. v2: - use two csum_add() instead of csum_fold() (Alexander Duyck) - Change delta type to __wsum to reduce casts (Alexander Duyck) Fixes: 09f3d1a3a52c ("ipv6/gso: remove temporary HBH/jumbo header") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230605161647.3624428-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
* ipv6: rpl: Fix Route of Death.Kuniyuki Iwashima2023-06-141-18/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit a2f4c143d76b1a47c91ef9bc46907116b111da0b ] A remote DoS vulnerability of RPL Source Routing is assigned CVE-2023-2156. The Source Routing Header (SRH) has the following format: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Next Header | Hdr Ext Len | Routing Type | Segments Left | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | CmprI | CmprE | Pad | Reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | . . . Addresses[1..n] . . . | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ The originator of an SRH places the first hop's IPv6 address in the IPv6 header's IPv6 Destination Address and the second hop's IPv6 address as the first address in Addresses[1..n]. The CmprI and CmprE fields indicate the number of prefix octets that are shared with the IPv6 Destination Address. When CmprI or CmprE is not 0, Addresses[1..n] are compressed as follows: 1..n-1 : (16 - CmprI) bytes n : (16 - CmprE) bytes Segments Left indicates the number of route segments remaining. When the value is not zero, the SRH is forwarded to the next hop. Its address is extracted from Addresses[n - Segment Left + 1] and swapped with IPv6 Destination Address. When Segment Left is greater than or equal to 2, the size of SRH is not changed because Addresses[1..n-1] are decompressed and recompressed with CmprI. OTOH, when Segment Left changes from 1 to 0, the new SRH could have a different size because Addresses[1..n-1] are decompressed with CmprI and recompressed with CmprE. Let's say CmprI is 15 and CmprE is 0. When we receive SRH with Segment Left >= 2, Addresses[1..n-1] have 1 byte for each, and Addresses[n] has 16 bytes. When Segment Left is 1, Addresses[1..n-1] is decompressed to 16 bytes and not recompressed. Finally, the new SRH will need more room in the header, and the size is (16 - 1) * (n - 1) bytes. Here the max value of n is 255 as Segment Left is u8, so in the worst case, we have to allocate 3825 bytes in the skb headroom. However, now we only allocate a small fixed buffer that is IPV6_RPL_SRH_WORST_SWAP_SIZE (16 + 7 bytes). If the decompressed size overflows the room, skb_push() hits BUG() below [0]. Instead of allocating the fixed buffer for every packet, let's allocate enough headroom only when we receive SRH with Segment Left 1. [0]: skbuff: skb_under_panic: text:ffffffff81c9f6e2 len:576 put:576 head:ffff8880070b5180 data:ffff8880070b4fb0 tail:0x70 end:0x140 dev:lo kernel BUG at net/core/skbuff.c:200! invalid opcode: 0000 [#1] PREEMPT SMP PTI CPU: 0 PID: 154 Comm: python3 Not tainted 6.4.0-rc4-00190-gc308e9ec0047 #7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:skb_panic (net/core/skbuff.c:200) Code: 4f 70 50 8b 87 bc 00 00 00 50 8b 87 b8 00 00 00 50 ff b7 c8 00 00 00 4c 8b 8f c0 00 00 00 48 c7 c7 80 6e 77 82 e8 ad 8b 60 ff <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 RSP: 0018:ffffc90000003da0 EFLAGS: 00000246 RAX: 0000000000000085 RBX: ffff8880058a6600 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff88807dc1c540 RDI: ffff88807dc1c540 RBP: ffffc90000003e48 R08: ffffffff82b392c8 R09: 00000000ffffdfff R10: ffffffff82a592e0 R11: ffffffff82b092e0 R12: ffff888005b1c800 R13: ffff8880070b51b8 R14: ffff888005b1ca18 R15: ffff8880070b5190 FS: 00007f4539f0b740(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055670baf3000 CR3: 0000000005b0e000 CR4: 00000000007506f0 PKRU: 55555554 Call Trace: <IRQ> skb_push (net/core/skbuff.c:210) ipv6_rthdr_rcv (./include/linux/skbuff.h:2880 net/ipv6/exthdrs.c:634 net/ipv6/exthdrs.c:718) ip6_protocol_deliver_rcu (net/ipv6/ip6_input.c:437 (discriminator 5)) ip6_input_finish (./include/linux/rcupdate.h:805 net/ipv6/ip6_input.c:483) __netif_receive_skb_one_core (net/core/dev.c:5494) process_backlog (./include/linux/rcupdate.h:805 net/core/dev.c:5934) __napi_poll (net/core/dev.c:6496) net_rx_action (net/core/dev.c:6565 net/core/dev.c:6696) __do_softirq (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:207 ./include/trace/events/irq.h:142 kernel/softirq.c:572) do_softirq (kernel/softirq.c:472 kernel/softirq.c:459) </IRQ> <TASK> __local_bh_enable_ip (kernel/softirq.c:396) __dev_queue_xmit (net/core/dev.c:4272) ip6_finish_output2 (./include/net/neighbour.h:544 net/ipv6/ip6_output.c:134) rawv6_sendmsg (./include/net/dst.h:458 ./include/linux/netfilter.h:303 net/ipv6/raw.c:656 net/ipv6/raw.c:914) sock_sendmsg (net/socket.c:724 net/socket.c:747) __sys_sendto (net/socket.c:2144) __x64_sys_sendto (net/socket.c:2156 net/socket.c:2152 net/socket.c:2152) do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) RIP: 0033:0x7f453a138aea Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89 RSP: 002b:00007ffcc212a1c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00007ffcc212a288 RCX: 00007f453a138aea RDX: 0000000000000060 RSI: 00007f4539084c20 RDI: 0000000000000003 RBP: 00007f4538308e80 R08: 00007ffcc212a300 R09: 000000000000001c R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: ffffffffc4653600 R14: 0000000000000001 R15: 00007f4539712d1b </TASK> Modules linked in: Fixes: 8610c7c6e3bd ("net: ipv6: add support for rpl sr exthdr") Reported-by: Max VA Closes: https://www.interruptlabs.co.uk/articles/linux-ipv6-route-of-death Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230605180617.67284-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
* netfilter: nf_tables: out-of-bound check in chain blobPablo Neira Ayuso2023-06-141-1/+1
| | | | | | | | | | [ Upstream commit 08e42a0d3ad30f276f9597b591f975971a1b0fcf ] Add current size of rule expressions to the boundary check. Fixes: 2c865a8a28a1 ("netfilter: nf_tables: add rule blob layout") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
* netfilter: ipset: Add schedule point in call_ad().Kuniyuki Iwashima2023-06-141-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 24e227896bbf003165e006732dccb3516f87f88e ] syzkaller found a repro that causes Hung Task [0] with ipset. The repro first creates an ipset and then tries to delete a large number of IPs from the ipset concurrently: IPSET_ATTR_IPADDR_IPV4 : 172.20.20.187 IPSET_ATTR_CIDR : 2 The first deleting thread hogs a CPU with nfnl_lock(NFNL_SUBSYS_IPSET) held, and other threads wait for it to be released. Previously, the same issue existed in set->variant->uadt() that could run so long under ip_set_lock(set). Commit 5e29dc36bd5e ("netfilter: ipset: Rework long task execution when adding/deleting entries") tried to fix it, but the issue still exists in the caller with another mutex. While adding/deleting many IPs, we should release the CPU periodically to prevent someone from abusing ipset to hang the system. Note we need to increment the ipset's refcnt to prevent the ipset from being destroyed while rescheduling. [0]: INFO: task syz-executor174:268 blocked for more than 143 seconds. Not tainted 6.4.0-rc1-00145-gba79e9a73284 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:syz-executor174 state:D stack:0 pid:268 ppid:260 flags:0x0000000d Call trace: __switch_to+0x308/0x714 arch/arm64/kernel/process.c:556 context_switch kernel/sched/core.c:5343 [inline] __schedule+0xd84/0x1648 kernel/sched/core.c:6669 schedule+0xf0/0x214 kernel/sched/core.c:6745 schedule_preempt_disabled+0x58/0xf0 kernel/sched/core.c:6804 __mutex_lock_common kernel/locking/mutex.c:679 [inline] __mutex_lock+0x6fc/0xdb0 kernel/locking/mutex.c:747 __mutex_lock_slowpath+0x14/0x20 kernel/locking/mutex.c:1035 mutex_lock+0x98/0xf0 kernel/locking/mutex.c:286 nfnl_lock net/netfilter/nfnetlink.c:98 [inline] nfnetlink_rcv_msg+0x480/0x70c net/netfilter/nfnetlink.c:295 netlink_rcv_skb+0x1c0/0x350 net/netlink/af_netlink.c:2546 nfnetlink_rcv+0x18c/0x199c net/netfilter/nfnetlink.c:658 netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline] netlink_unicast+0x664/0x8cc net/netlink/af_netlink.c:1365 netlink_sendmsg+0x6d0/0xa4c net/netlink/af_netlink.c:1913 sock_sendmsg_nosec net/socket.c:724 [inline] sock_sendmsg net/socket.c:747 [inline] ____sys_sendmsg+0x4b8/0x810 net/socket.c:2503 ___sys_sendmsg net/socket.c:2557 [inline] __sys_sendmsg+0x1f8/0x2a4 net/socket.c:2586 __do_sys_sendmsg net/socket.c:2595 [inline] __se_sys_sendmsg net/socket.c:2593 [inline] __arm64_sys_sendmsg+0x80/0x94 net/socket.c:2593 __invoke_syscall arch/arm64/kernel/syscall.c:38 [inline] invoke_syscall+0x84/0x270 arch/arm64/kernel/syscall.c:52 el0_svc_common+0x134/0x24c arch/arm64/kernel/syscall.c:142 do_el0_svc+0x64/0x198 arch/arm64/kernel/syscall.c:193 el0_svc+0x2c/0x7c arch/arm64/kernel/entry-common.c:637 el0t_64_sync_handler+0x84/0xf0 arch/arm64/kernel/entry-common.c:655 el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:591 Reported-by: syzkaller <syzkaller@googlegroups.com> Fixes: a7b4f989a629 ("netfilter: ipset: IP set core support") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
* netfilter: conntrack: fix NULL pointer dereference in nf_confirm_cthelperTijs Van Buggenhout2023-06-141-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit e1f543dc660b44618a1bd72ddb4ca0828a95f7ad ] An nf_conntrack_helper from nf_conn_help may become NULL after DNAT. Observed when TCP port 1720 (Q931_PORT), associated with h323 conntrack helper, is DNAT'ed to another destination port (e.g. 1730), while nfqueue is being used for final acceptance (e.g. snort). This happenned after transition from kernel 4.14 to 5.10.161. Workarounds: * keep the same port (1720) in DNAT * disable nfqueue * disable/unload h323 NAT helper $ linux-5.10/scripts/decode_stacktrace.sh vmlinux < /tmp/kernel.log BUG: kernel NULL pointer dereference, address: 0000000000000084 [..] RIP: 0010:nf_conntrack_update (net/netfilter/nf_conntrack_core.c:2080 net/netfilter/nf_conntrack_core.c:2134) nf_conntrack [..] nfqnl_reinject (net/netfilter/nfnetlink_queue.c:237) nfnetlink_queue nfqnl_recv_verdict (net/netfilter/nfnetlink_queue.c:1230) nfnetlink_queue nfnetlink_rcv_msg (net/netfilter/nfnetlink.c:241) nfnetlink [..] Fixes: ee04805ff54a ("netfilter: conntrack: make conntrack userspace helpers work again") Signed-off-by: Tijs Van Buggenhout <tijs.van.buggenhout@axsguard.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
* netfilter: nft_bitwise: fix register trackingJeremy Sowden2023-06-141-1/+1
| | | | | | | | | | | | | | | [ Upstream commit 14e8b293903785590a0ef168745ac84250cb1f4c ] At the end of `nft_bitwise_reduce`, there is a loop which is intended to update the bitwise expression associated with each tracked destination register. However, currently, it just updates the first register repeatedly. Fix it. Fixes: 34cc9e52884a ("netfilter: nf_tables: cancel tracking for clobbered destination registers") Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
* netfilter: nf_tables: Add null check for nla_nest_start_noflag() in ↵Gavrilov Ilia2023-06-141-0/+2
| | | | | | | | | | | | | | | | | | nft_dump_basechain_hook() [ Upstream commit bd058763a624a1fb5c20f3c46e632d623c043676 ] The nla_nest_start_noflag() function may fail and return NULL; the return value needs to be checked. Found by InfoTeCS on behalf of Linux Verification Center (linuxtesting.org) with SVACE. Fixes: d54725cd11a5 ("netfilter: nf_tables: support for multiple devices per netdev hook") Signed-off-by: Gavrilov Ilia <Ilia.Gavrilov@infotecs.ru> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
* wifi: cfg80211: fix locking in sched scan stop workJohannes Berg2023-06-141-2/+2
| | | | | | | | | | | [ Upstream commit 3e54ed8247c94c8bdf370bd872bd9dfe72b1b12b ] This should use wiphy_lock() now instead of acquiring the RTNL, since cfg80211_stop_sched_scan_req() now needs that. Fixes: a05829a7222e ("cfg80211: avoid holding the RTNL when calling the driver") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* wifi: mac80211: don't translate beacon/presp addrsJohannes Berg2023-06-141-1/+3
| | | | | | | | | | | | | | | | | | [ Upstream commit 47c171a426e305f2225b92ed7b5e0a990c95f6d4 ] Don't do link address translation for beacons and probe responses, this leads to reporting multiple scan list entries for the same AP (one with the MLD address) which just breaks things. We might need to extend this in the future for some other (action) frames that aren't MLD addressed. Fixes: 42fb9148c078 ("wifi: mac80211: do link->MLD address translation on RX") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230604120651.62adead1b43a.Ifc25eed26ebf3b269f60b1ec10060156d0e7ec0d@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* wifi: mac80211: mlme: fix non-inheritence elementJohannes Berg2023-06-141-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 68c228557d52616cf040651abefda9839de7086a ] There were two bugs when creating the non-inheritence element: 1) 'at_extension' needs to be declared outside the loop, otherwise the value resets every iteration and we can never really switch properly 2) 'added' never got set to true, so we always cut off the extension element again at the end of the function This shows another issue that we might add a list but no extension list, but we need to make the extension list a zero-length one in that case. Fix all these issues. While at it, add a comment explaining the trim. Fixes: 81151ce462e5 ("wifi: mac80211: support MLO authentication/association with one link") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230604120651.3addaa5c4782.If3a78f9305997ad7ef4ba7ffc17a8234c956f613@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* wifi: cfg80211: reject bad AP MLD addressJohannes Berg2023-06-141-0/+2
| | | | | | | | | | | | | | | [ Upstream commit 727073ca5e55ab6a07df316250be8a12606e8677 ] When trying to authenticate, if the AP MLD address isn't a valid address, mac80211 can throw a warning. Avoid that by rejecting such addresses. Fixes: d648c23024bd ("wifi: nl80211: support MLO in auth/assoc") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230604120651.89188912bd1d.I8dbc6c8ee0cb766138803eec59508ef4ce477709@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* wifi: mac80211: use correct iftype HE capJohannes Berg2023-06-141-5/+10
| | | | | | | | | | | | | | | [ Upstream commit c37ab22bb1a43cdca8bf69cc0a22f1ccfc449e68 ] We already check that the right iftype capa exists, but then don't use it. Assign it to a variable so we can actually use it, and then do that. Fixes: bac2fd3d7534 ("mac80211: remove use of ieee80211_get_he_sta_cap()") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230604120651.0e908e5c5fdd.Iac142549a6144ac949ebd116b921a59ae5282735@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* Bluetooth: L2CAP: Add missing checks for invalid DCIDSungwoo Kim2023-06-141-0/+9
| | | | | | | | | | | | | | | [ Upstream commit 75767213f3d9b97f63694d02260b6a49a2271876 ] When receiving a connect response we should make sure that the DCID is within the valid range and that we don't already have another channel allocated for the same DCID. Missing checks may violate the specification (BLUETOOTH CORE SPECIFICATION Version 5.4 | Vol 3, Part A, Page 1046). Fixes: 40624183c202 ("Bluetooth: L2CAP: Add missing checks for invalid LE DCID") Signed-off-by: Sungwoo Kim <iam@sung-woo.kim> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* Bluetooth: ISO: use correct CIS order in Set CIG Parameters eventPauli Virtanen2023-06-141-18/+26
| | | | | | | | | | | | | | | | | | | | | [ Upstream commit 71e9588435c38112d6a8686d3d8e7cc1de8fe22c ] The order of CIS handle array in Set CIG Parameters response shall match the order of the CIS_ID array in the command (Core v5.3 Vol 4 Part E Sec 7.8.97). We send CIS_IDs mainly in the order of increasing CIS_ID (but with "last" CIS first if it has fixed CIG_ID). In handling of the reply, we currently assume this is also the same as the order of hci_conn in hdev->conn_hash, but that is not true. Match the correct hci_conn to the correct handle by matching them based on the CIG+CIS combination. The CIG+CIS combination shall be unique for ISO_LINK hci_conn at state >= BT_BOUND, which we maintain in hci_le_set_cig_params. Fixes: 26afbd826ee3 ("Bluetooth: Add initial implementation of CIS connections") Signed-off-by: Pauli Virtanen <pav@iki.fi> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* Bluetooth: hci_conn: Fix not matching by CIS IDLuiz Augusto von Dentz2023-06-141-1/+2
| | | | | | | | | | | [ Upstream commit c14516faede33c2c31da45cf950d55dbff42962e ] This fixes only matching CIS by address which prevents creating new hcon if upper layer is requesting a specific CIS ID. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Stable-dep-of: 71e9588435c3 ("Bluetooth: ISO: use correct CIS order in Set CIG Parameters event") Signed-off-by: Sasha Levin <sashal@kernel.org>
* Bluetooth: hci_conn: Add support for linking multiple hconLuiz Augusto von Dentz2023-06-143-95/+160
| | | | | | | | | | | | | | | | | | [ Upstream commit 06149746e7203d5ffe2d6faf9799ee36203aa8b8 ] Since it is required for some configurations to have multiple CIS with the same peer which is now covered by iso-tester in the following test cases: ISO AC 6(i) - Success ISO AC 7(i) - Success ISO AC 8(i) - Success ISO AC 9(i) - Success ISO AC 11(i) - Success Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Stable-dep-of: 71e9588435c3 ("Bluetooth: ISO: use correct CIS order in Set CIG Parameters event") Signed-off-by: Sasha Levin <sashal@kernel.org>
* Bluetooth: ISO: don't try to remove CIG if there are bound CIS leftPauli Virtanen2023-06-141-0/+2
| | | | | | | | | | | | | [ Upstream commit 6c242c64a09e78349fb0a5f0a6f8076a3d7c0bb4 ] Consider existing BOUND & CONNECT state CIS to block CIG removal. Otherwise, under suitable timing conditions we may attempt to remove CIG while Create CIS is pending, which fails. Fixes: 26afbd826ee3 ("Bluetooth: Add initial implementation of CIS connections") Signed-off-by: Pauli Virtanen <pav@iki.fi> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* Bluetooth: Fix l2cap_disconnect_req deadlockYing Hsu2023-06-141-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 02c5ea5246a44d6ffde0fddebfc1d56188052976 ] L2CAP assumes that the locks conn->chan_lock and chan->lock are acquired in the order conn->chan_lock, chan->lock to avoid potential deadlock. For example, l2sock_shutdown acquires these locks in the order: mutex_lock(&conn->chan_lock) l2cap_chan_lock(chan) However, l2cap_disconnect_req acquires chan->lock in l2cap_get_chan_by_scid first and then acquires conn->chan_lock before calling l2cap_chan_del. This means that these locks are acquired in unexpected order, which leads to potential deadlock: l2cap_chan_lock(c) mutex_lock(&conn->chan_lock) This patch releases chan->lock before acquiring the conn_chan_lock to avoid the potential deadlock. Fixes: a2a9339e1c9d ("Bluetooth: L2CAP: Fix use-after-free in l2cap_disconnect_{req,rsp}") Signed-off-by: Ying Hsu <yinghsu@chromium.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* Bluetooth: hci_sync: add lock to protect HCI_UNREGISTERZhengping Jiang2023-06-142-6/+16
| | | | | | | | | | | | | [ Upstream commit 1857c19941c87eb36ad47f22a406be5dfe5eff9f ] When the HCI_UNREGISTER flag is set, no jobs should be scheduled. Fix potential race when HCI_UNREGISTER is set after the flag is tested in hci_cmd_sync_queue. Fixes: 0b94f2651f56 ("Bluetooth: hci_sync: Fix queuing commands when HCI_UNREGISTER is set") Signed-off-by: Zhengping Jiang <jiangzp@google.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* Bluetooth: ISO: Fix CIG auto-allocation to select configurable CIGPauli Virtanen2023-06-141-7/+6
| | | | | | | | | | | | | | | | | [ Upstream commit e6a7a46b8636efe95c75bed63a57fc05c13feba4 ] Make CIG auto-allocation to select the first CIG_ID that is still configurable. Also use correct CIG_ID range (see Core v5.3 Vol 4 Part E Sec 7.8.97 p.2553). Previously, it would always select CIG_ID 0 regardless of anything, because cis_list with data.cis == 0xff (BT_ISO_QOS_CIS_UNSET) would not count any CIS. Since we are not adding CIS here, use find_cis instead. Fixes: 26afbd826ee3 ("Bluetooth: Add initial implementation of CIS connections") Signed-off-by: Pauli Virtanen <pav@iki.fi> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* Bluetooth: ISO: consider right CIS when removing CIG at cleanupPauli Virtanen2023-06-141-2/+5
| | | | | | | | | | | | [ Upstream commit 31c5f9164949347c9cb34f041a7e04fdc08b1b85 ] When looking for CIS blocking CIG removal, consider only the CIS with the right CIG ID. Don't try to remove CIG with unset CIG ID. Fixes: 26afbd826ee3 ("Bluetooth: Add initial implementation of CIS connections") Signed-off-by: Pauli Virtanen <pav@iki.fi> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* Bluetooth: Split bt_iso_qos into dedicated structuresIulia Tanasescu2023-06-143-118/+202
| | | | | | | | | | | | | [ Upstream commit 0fe8c8d071343fa9278980ce4b6f8e6ea24a2ed1 ] Split bt_iso_qos into dedicated unicast and broadcast structures and add additional broadcast parameters. Fixes: eca0ae4aea66 ("Bluetooth: Add initial implementation of BIS connections") Signed-off-by: Iulia Tanasescu <iulia.tanasescu@nxp.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Stable-dep-of: 31c5f9164949 ("Bluetooth: ISO: consider right CIS when removing CIG at cleanup") Signed-off-by: Sasha Levin <sashal@kernel.org>
* net/sched: fq_pie: ensure reasonable TCA_FQ_PIE_QUANTUM valuesEric Dumazet2023-06-141-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit cd2b8113c2e8b9f5a88a942e1eaca61eba401b85 ] We got multiple syzbot reports, all duplicates of the following [1] syzbot managed to install fq_pie with a zero TCA_FQ_PIE_QUANTUM, thus triggering infinite loops. Use limits similar to sch_fq, with commits 3725a269815b ("pkt_sched: fq: avoid hang when quantum 0") and d9e15a273306 ("pkt_sched: fq: do not accept silly TCA_FQ_QUANTUM") [1] watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [swapper/0:0] Modules linked in: irq event stamp: 172817 hardirqs last enabled at (172816): [<ffff80001242fde4>] __el1_irq arch/arm64/kernel/entry-common.c:476 [inline] hardirqs last enabled at (172816): [<ffff80001242fde4>] el1_interrupt+0x58/0x68 arch/arm64/kernel/entry-common.c:486 hardirqs last disabled at (172817): [<ffff80001242fdb0>] __el1_irq arch/arm64/kernel/entry-common.c:468 [inline] hardirqs last disabled at (172817): [<ffff80001242fdb0>] el1_interrupt+0x24/0x68 arch/arm64/kernel/entry-common.c:486 softirqs last enabled at (167634): [<ffff800008020c1c>] softirq_handle_end kernel/softirq.c:414 [inline] softirqs last enabled at (167634): [<ffff800008020c1c>] __do_softirq+0xac0/0xd54 kernel/softirq.c:600 softirqs last disabled at (167701): [<ffff80000802a660>] ____do_softirq+0x14/0x20 arch/arm64/kernel/irq.c:80 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.4.0-rc3-syzkaller-geb0f1697d729 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/28/2023 pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : fq_pie_qdisc_dequeue+0x10c/0x8ac net/sched/sch_fq_pie.c:246 lr : fq_pie_qdisc_dequeue+0xe4/0x8ac net/sched/sch_fq_pie.c:240 sp : ffff800008007210 x29: ffff800008007280 x28: ffff0000c86f7890 x27: ffff0000cb20c2e8 x26: ffff0000cb20c2f0 x25: dfff800000000000 x24: ffff0000cb20c2e0 x23: ffff0000c86f7880 x22: 0000000000000040 x21: 1fffe000190def10 x20: ffff0000cb20c2e0 x19: ffff0000cb20c2e0 x18: ffff800008006e60 x17: 0000000000000000 x16: ffff80000850af6c x15: 0000000000000302 x14: 0000000000000100 x13: 0000000000000000 x12: 0000000000000001 x11: 0000000000000302 x10: 0000000000000100 x9 : 0000000000000000 x8 : 0000000000000000 x7 : ffff80000841c468 x6 : 0000000000000000 x5 : 0000000000000001 x4 : 0000000000000001 x3 : 0000000000000000 x2 : ffff0000cb20c2e0 x1 : ffff0000cb20c2e0 x0 : 0000000000000001 Call trace: fq_pie_qdisc_dequeue+0x10c/0x8ac net/sched/sch_fq_pie.c:246 dequeue_skb net/sched/sch_generic.c:292 [inline] qdisc_restart net/sched/sch_generic.c:397 [inline] __qdisc_run+0x1fc/0x231c net/sched/sch_generic.c:415 __dev_xmit_skb net/core/dev.c:3868 [inline] __dev_queue_xmit+0xc80/0x3318 net/core/dev.c:4210 dev_queue_xmit include/linux/netdevice.h:3085 [inline] neigh_connected_output+0x2f8/0x38c net/core/neighbour.c:1581 neigh_output include/net/neighbour.h:544 [inline] ip6_finish_output2+0xd60/0x1a1c net/ipv6/ip6_output.c:134 __ip6_finish_output net/ipv6/ip6_output.c:195 [inline] ip6_finish_output+0x538/0x8c8 net/ipv6/ip6_output.c:206 NF_HOOK_COND include/linux/netfilter.h:292 [inline] ip6_output+0x270/0x594 net/ipv6/ip6_output.c:227 dst_output include/net/dst.h:458 [inline] NF_HOOK include/linux/netfilter.h:303 [inline] ndisc_send_skb+0xc30/0x1790 net/ipv6/ndisc.c:508 ndisc_send_rs+0x47c/0x5d4 net/ipv6/ndisc.c:718 addrconf_rs_timer+0x300/0x58c net/ipv6/addrconf.c:3936 call_timer_fn+0x19c/0x8cc kernel/time/timer.c:1700 expire_timers kernel/time/timer.c:1751 [inline] __run_timers+0x55c/0x734 kernel/time/timer.c:2022 run_timer_softirq+0x7c/0x114 kernel/time/timer.c:2035 __do_softirq+0x2d0/0xd54 kernel/softirq.c:571 ____do_softirq+0x14/0x20 arch/arm64/kernel/irq.c:80 call_on_irq_stack+0x24/0x4c arch/arm64/kernel/entry.S:882 do_softirq_own_stack+0x20/0x2c arch/arm64/kernel/irq.c:85 invoke_softirq kernel/softirq.c:452 [inline] __irq_exit_rcu+0x28c/0x534 kernel/softirq.c:650 irq_exit_rcu+0x14/0x84 kernel/softirq.c:662 __el1_irq arch/arm64/kernel/entry-common.c:472 [inline] el1_interrupt+0x38/0x68 arch/arm64/kernel/entry-common.c:486 el1h_64_irq_handler+0x18/0x24 arch/arm64/kernel/entry-common.c:491 el1h_64_irq+0x64/0x68 arch/arm64/kernel/entry.S:587 __daif_local_irq_enable arch/arm64/include/asm/irqflags.h:33 [inline] arch_local_irq_enable+0x8/0xc arch/arm64/include/asm/irqflags.h:55 cpuidle_idle_call kernel/sched/idle.c:170 [inline] do_idle+0x1f0/0x4e8 kernel/sched/idle.c:282 cpu_startup_entry+0x24/0x28 kernel/sched/idle.c:379 rest_init+0x2dc/0x2f4 init/main.c:735 start_kernel+0x0/0x55c init/main.c:834 start_kernel+0x3f0/0x55c init/main.c:1088 __primary_switched+0xb8/0xc0 arch/arm64/kernel/head.S:523 Fixes: ec97ecf1ebe4 ("net: sched: add Flow Queue PIE packet scheduler") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
* net/smc: Avoid to access invalid RMBs' MRs in SMCRv1 ADD LINK CONTWen Gu2023-06-141-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit c308e9ec004721a656c193243eab61a8be324657 ] SMCRv1 has a similar issue to SMCRv2 (see link below) that may access invalid MRs of RMBs when construct LLC ADD LINK CONT messages. BUG: kernel NULL pointer dereference, address: 0000000000000014 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 5 PID: 48 Comm: kworker/5:0 Kdump: loaded Tainted: G W E 6.4.0-rc3+ #49 Workqueue: events smc_llc_add_link_work [smc] RIP: 0010:smc_llc_add_link_cont+0x160/0x270 [smc] RSP: 0018:ffffa737801d3d50 EFLAGS: 00010286 RAX: ffff964f82144000 RBX: ffffa737801d3dd8 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff964f81370c30 RBP: ffffa737801d3dd4 R08: ffff964f81370000 R09: ffffa737801d3db0 R10: 0000000000000001 R11: 0000000000000060 R12: ffff964f82e70000 R13: ffff964f81370c38 R14: ffffa737801d3dd3 R15: 0000000000000001 FS: 0000000000000000(0000) GS:ffff9652bfd40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000014 CR3: 000000008fa20004 CR4: 00000000003706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> smc_llc_srv_rkey_exchange+0xa7/0x190 [smc] smc_llc_srv_add_link+0x3ae/0x5a0 [smc] smc_llc_add_link_work+0xb8/0x140 [smc] process_one_work+0x1e5/0x3f0 worker_thread+0x4d/0x2f0 ? __pfx_worker_thread+0x10/0x10 kthread+0xe5/0x120 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x2c/0x50 </TASK> When an alernate RNIC is available in system, SMC will try to add a new link based on the RNIC for resilience. All the RMBs in use will be mapped to the new link. Then the RMBs' MRs corresponding to the new link will be filled into LLC messages. For SMCRv1, they are ADD LINK CONT messages. However smc_llc_add_link_cont() may mistakenly access to unused RMBs which haven't been mapped to the new link and have no valid MRs, thus causing a crash. So this patch fixes it. Fixes: 87f88cda2128 ("net/smc: rkey processing for a new link as SMC client") Link: https://lore.kernel.org/r/1685101741-74826-3-git-send-email-guwen@linux.alibaba.com Signed-off-by: Wen Gu <guwen@linux.alibaba.com> Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
* net/ipv4: ping_group_range: allow GID from 2147483648 to 4294967294Akihiro Suda2023-06-141-4/+4
| | | | | | | | | | | | | | | | | | | | | | [ Upstream commit e209fee4118fe9a449d4d805361eb2de6796be39 ] With this commit, all the GIDs ("0 4294967294") can be written to the "net.ipv4.ping_group_range" sysctl. Note that 4294967295 (0xffffffff) is an invalid GID (see gid_valid() in include/linux/uidgid.h), and an attempt to register this number will cause -EINVAL. Prior to this commit, only up to GID 2147483647 could be covered. Documentation/networking/ip-sysctl.rst had "0 4294967295" as an example value, but this example was wrong and causing -EINVAL. Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind") Co-developed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
* bpf, sockmap: Avoid potential NULL dereference in sk_psock_verdict_data_ready()Eric Dumazet2023-06-141-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit b320a45638296b63be8d9a901ca8bc43716b1ae1 ] syzbot found sk_psock(sk) could return NULL when called from sk_psock_verdict_data_ready(). Just make sure to handle this case. [1] general protection fault, probably for non-canonical address 0xdffffc000000005c: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x00000000000002e0-0x00000000000002e7] CPU: 0 PID: 15 Comm: ksoftirqd/0 Not tainted 6.4.0-rc3-syzkaller-00588-g4781e965e655 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/16/2023 RIP: 0010:sk_psock_verdict_data_ready+0x19f/0x3c0 net/core/skmsg.c:1213 Code: 4c 89 e6 e8 63 70 5e f9 4d 85 e4 75 75 e8 19 74 5e f9 48 8d bb e0 02 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 07 02 00 00 48 89 ef ff 93 e0 02 00 00 e8 29 fd RSP: 0018:ffffc90000147688 EFLAGS: 00010206 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000100 RDX: 000000000000005c RSI: ffffffff8825ceb7 RDI: 00000000000002e0 RBP: ffff888076518c40 R08: 0000000000000007 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000008000 R15: ffff888076518c40 FS: 0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f901375bab0 CR3: 000000004bf26000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> tcp_data_ready+0x10a/0x520 net/ipv4/tcp_input.c:5006 tcp_data_queue+0x25d3/0x4c50 net/ipv4/tcp_input.c:5080 tcp_rcv_established+0x829/0x1f90 net/ipv4/tcp_input.c:6019 tcp_v4_do_rcv+0x65a/0x9c0 net/ipv4/tcp_ipv4.c:1726 tcp_v4_rcv+0x2cbf/0x3340 net/ipv4/tcp_ipv4.c:2148 ip_protocol_deliver_rcu+0x9f/0x480 net/ipv4/ip_input.c:205 ip_local_deliver_finish+0x2ec/0x520 net/ipv4/ip_input.c:233 NF_HOOK include/linux/netfilter.h:303 [inline] NF_HOOK include/linux/netfilter.h:297 [inline] ip_local_deliver+0x1ae/0x200 net/ipv4/ip_input.c:254 dst_input include/net/dst.h:468 [inline] ip_rcv_finish+0x1cf/0x2f0 net/ipv4/ip_input.c:449 NF_HOOK include/linux/netfilter.h:303 [inline] NF_HOOK include/linux/netfilter.h:297 [inline] ip_rcv+0xae/0xd0 net/ipv4/ip_input.c:569 __netif_receive_skb_one_core+0x114/0x180 net/core/dev.c:5491 __netif_receive_skb+0x1f/0x1c0 net/core/dev.c:5605 process_backlog+0x101/0x670 net/core/dev.c:5933 __napi_poll+0xb7/0x6f0 net/core/dev.c:6499 napi_poll net/core/dev.c:6566 [inline] net_rx_action+0x8a9/0xcb0 net/core/dev.c:6699 __do_softirq+0x1d4/0x905 kernel/softirq.c:571 run_ksoftirqd kernel/softirq.c:939 [inline] run_ksoftirqd+0x31/0x60 kernel/softirq.c:931 smpboot_thread_fn+0x659/0x9e0 kernel/smpboot.c:164 kthread+0x344/0x440 kernel/kthread.c:379 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308 </TASK> Fixes: 6df7f764cd3c ("bpf, sockmap: Wake up polling after data copy") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20230530195149.68145-1-edumazet@google.com Signed-off-by: Sasha Levin <sashal@kernel.org>
* mptcp: fix active subflow finalizationPaolo Abeni2023-06-091-9/+14
| | | | | | | | | | | | | | | | | | | | | | | | | commit 55b47ca7d80814ceb63d64e032e96cd6777811e5 upstream. Active subflow are inserted into the connection list at creation time. When the MPJ handshake completes successfully, a new subflow creation netlink event is generated correctly, but the current code wrongly avoid initializing a couple of subflow data. The above will cause misbehavior on a few exceptional events: unneeded mptcp-level retransmission on msk-level sequence wrap-around and infinite mapping fallback even when a MPJ socket is present. Address the issue factoring out the needed initialization in a new helper and invoking the latter from __mptcp_finish_join() time for passive subflow and from mptcp_finish_join() for active ones. Fixes: 0530020a7c8f ("mptcp: track and update contiguous data status") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* mptcp: fix connect timeout handlingPaolo Abeni2023-06-092-23/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 786fc12457268cc9b555dde6c22ae7300d4b40e1 upstream. Ondrej reported a functional issue WRT timeout handling on connect with a nice reproducer. The problem is that the current mptcp connect waits for both the MPTCP socket level timeout, and the first subflow socket timeout. The latter is not influenced/touched by the exposed setsockopt(). Overall the above makes the SO_SNDTIMEO a no-op on connect. Since mptcp_connect is invoked via inet_stream_connect and the latter properly handle the MPTCP level timeout, we can address the issue making the nested subflow level connect always unblocking. This also allow simplifying a bit the code, dropping an ugly hack to handle the fastopen and custom proto_ops connect. The issues predates the blamed commit below, but the current resolution requires the infrastructure introduced there. Fixes: 54f1944ed6d2 ("mptcp: factor out mptcp_connect()") Reported-by: Ondrej Mosnacek <omosnace@redhat.com> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/399 Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* atm: hide unused procfs functionsArnd Bergmann2023-06-091-0/+2
| | | | | | | | | | | | | | | | | | | [ Upstream commit fb1b7be9b16c1f4626969ba4e95a97da2a452b41 ] When CONFIG_PROC_FS is disabled, the function declarations for some procfs functions are hidden, but the definitions are still build, as shown by this compiler warning: net/atm/resources.c:403:7: error: no previous prototype for 'atm_dev_seq_start' [-Werror=missing-prototypes] net/atm/resources.c:409:6: error: no previous prototype for 'atm_dev_seq_stop' [-Werror=missing-prototypes] net/atm/resources.c:414:7: error: no previous prototype for 'atm_dev_seq_next' [-Werror=missing-prototypes] Add another #ifdef to leave these out of the build. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20230516194625.549249-2-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
* netfilter: conntrack: define variables exp_nat_nla_policy and any_addr with ↵Tom Rix2023-06-091-0/+4
| | | | | | | | | | | | | | | | | | | | | | | CONFIG_NF_NAT [ Upstream commit 224a876e37543eee111bf9b6aa4935080e619335 ] gcc with W=1 and ! CONFIG_NF_NAT net/netfilter/nf_conntrack_netlink.c:3463:32: error: ‘exp_nat_nla_policy’ defined but not used [-Werror=unused-const-variable=] 3463 | static const struct nla_policy exp_nat_nla_policy[CTA_EXPECT_NAT_MAX+1] = { | ^~~~~~~~~~~~~~~~~~ net/netfilter/nf_conntrack_netlink.c:2979:33: error: ‘any_addr’ defined but not used [-Werror=unused-const-variable=] 2979 | static const union nf_inet_addr any_addr; | ^~~~~~~~ These variables use is controlled by CONFIG_NF_NAT, so should their definitions. Signed-off-by: Tom Rix <trix@redhat.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
* wifi: mac80211: recalc chanctx mindef before assigningJohannes Berg2023-06-091-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 04312de4ced4b152749614e8179f3978a20a992f ] When we allocate a new channel context, or find an existing one that is compatible, we currently assign it to a link before its mindef is updated. This leads to strange situations, especially in link switching where you switch to an 80 MHz link and expect it to be active immediately, but the mindef is still configured to 20 MHz while assigning. Also, it's strange that the chandef passed to the assign method's argument is wider than the one in the context. Fix this by calculating the mindef with the new link considered before calling the driver. In particular, this fixes an iwlwifi problem during link switch where the firmware would assert because the (link) station that was added for the AP is configured to transmit at a bandwidth that's wider than the channel context that it's configured on. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230504134511.828474-5-gregory.greenman@intel.com Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* wifi: mac80211: consider reserved chanctx for mindefJohannes Berg2023-06-093-30/+47
| | | | | | | | | | | | | | | | | | | | | | [ Upstream commit b72a455a2409fd94d6d9b4eb51d659a88213243b ] When a chanctx is reserved for a new vif and we recalculate the minimal definition for it, we need to consider the new interface it's being reserved for before we assign it, so it can be used directly with the correct min channel width. Fix the code to - optionally - consider that, and use that option just before doing the reassignment. Also, when considering channel context reservations, we should only consider the one link we're currently working with. Change the boolean argument to a link pointer to do that. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230504134511.828474-4-gregory.greenman@intel.com Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>