summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'net-6.2-rc7' of ↵Linus Torvalds2023-02-0268-272/+599
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf, can and netfilter. Current release - regressions: - phy: fix null-deref in phy_attach_direct - mac802154: fix possible double free upon parsing error Previous releases - regressions: - bpf: preserve reg parent/live fields when copying range info, prevent mis-verification of programs as safe - ip6: fix GRE tunnels not generating IPv6 link local addresses - phy: dp83822: fix null-deref on DP83825/DP83826 devices - sctp: do not check hb_timer.expires when resetting hb_timer - eth: mtk_sock: fix SGMII configuration after phylink conversion Previous releases - always broken: - eth: xdp: execute xdp_do_flush() before napi_complete_done() - skb: do not mix page pool and page referenced frags in GRO - bpf: - fix a possible task gone issue with bpf_send_signal[_thread]() - fix an off-by-one bug in bpf_mem_cache_idx() to select the right cache - add missing btf_put to register_btf_id_dtor_kfuncs - sockmap: fon't let sock_map_{close,destroy,unhash} call itself - gso: fix null-deref in skb_segment_list() - mctp: purge receive queues on sk destruction - fix UaF caused by accept on already connected socket in exotic socket families - tls: don't treat list head as an entry in tls_is_tx_ready() - netfilter: br_netfilter: disable sabotage_in hook after first suppression - wwan: t7xx: fix runtime PM implementation Misc: - MAINTAINERS: spring cleanup of networking maintainers" * tag 'net-6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (65 commits) mtk_sgmii: enable PCS polling to allow SFP work net: mediatek: sgmii: fix duplex configuration net: mediatek: sgmii: ensure the SGMII PHY is powered down on configuration MAINTAINERS: update SCTP maintainers MAINTAINERS: ipv6: retire Hideaki Yoshifuji mailmap: add John Crispin's entry MAINTAINERS: bonding: move Veaceslav Falico to CREDITS net: openvswitch: fix flow memory leak in ovs_flow_cmd_new net: ethernet: mtk_eth_soc: disable hardware DSA untagging for second MAC virtio-net: Keep stop() to follow mirror sequence of open() selftests: net: udpgso_bench_tx: Cater for pending datagrams zerocopy benchmarking selftests: net: udpgso_bench: Fix racing bug between the rx/tx programs selftests: net: udpgso_bench_rx/tx: Stop when wrong CLI args are provided selftests: net: udpgso_bench_rx: Fix 'used uninitialized' compiler warning can: mcp251xfd: mcp251xfd_ring_set_ringparam(): assign missing tx_obj_num_coalesce_irq can: isotp: split tx timer into transmission and timeout can: isotp: handle wait_event_interruptible() return values can: raw: fix CAN FD frame transmissions over CAN XL devices can: j1939: fix errant WARN_ON_ONCE in j1939_session_deactivate hv_netvsc: Fix missed pagebuf entries in netvsc_dma_map/unmap() ...
| * Merge branch 'fixes-for-mtk_eth_soc'Jakub Kicinski2023-02-022-15/+35
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Bjørn Mork says: ==================== Fix mtk_eth_soc sgmii configuration. This has been tested on a MT7986 with a Maxlinear GPY211C phy permanently attached to the second SoC mac. ==================== Link: https://lore.kernel.org/r/20230201182331.943411-1-bjorn@mork.no Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| | * mtk_sgmii: enable PCS polling to allow SFP workAlexander Couzens2023-02-021-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently there is no IRQ handling (even the SGMII supports it). Enable polling to support SFP ports. Fixes: 14a44ab0330d ("net: mtk_eth_soc: partially convert to phylink_pcs") Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Alexander Couzens <lynxis@fe80.eu> [ bmork: changed "1" => "true" ] Signed-off-by: Bjørn Mork <bjorn@mork.no> Acked-by: Daniel Golle <daniel@makrotopia.org> Tested-by: Daniel Golle <daniel@makrotopia.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| | * net: mediatek: sgmii: fix duplex configurationBjørn Mork2023-02-022-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The logic of the duplex bit is inverted. Setting it means half duplex, not full duplex. Fix and rename macro to avoid confusion. Fixes: 7e538372694b ("net: ethernet: mediatek: Re-add support SGMII") Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Bjørn Mork <bjorn@mork.no> Acked-by: Daniel Golle <daniel@makrotopia.org> Tested-by: Daniel Golle <daniel@makrotopia.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| | * net: mediatek: sgmii: ensure the SGMII PHY is powered down on configurationAlexander Couzens2023-02-022-11/+30
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The code expect the PHY to be in power down which is only true after reset. Allow changes of the SGMII parameters more than once. Only power down when reconfiguring to avoid bouncing the link when there's no reason to - based on code from Russell King. There are cases when the SGMII_PHYA_PWD register contains 0x9 which prevents SGMII from working. The SGMII still shows link but no traffic can flow. Writing 0x0 to the PHYA_PWD register fix the issue. 0x0 was taken from a good working state of the SGMII interface. Fixes: 42c03844e93d ("net-next: mediatek: add support for MediaTek MT7622 SoC") Suggested-by: Russell King (Oracle) <linux@armlinux.org.uk> Signed-off-by: Alexander Couzens <lynxis@fe80.eu> [ bmork: rebased and squashed into one patch ] Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Bjørn Mork <bjorn@mork.no> Acked-by: Daniel Golle <daniel@makrotopia.org> Tested-by: Daniel Golle <daniel@makrotopia.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * Merge tag 'linux-can-fixes-for-6.2-20230202' of ↵Jakub Kicinski2023-02-024-56/+65
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== can 2023-02-02 The first patch is by Ziyang Xuan and removes a errant WARN_ON_ONCE() in the CAN J1939 protocol. The next 3 patches are by Oliver Hartkopp. The first 2 target the CAN ISO-TP protocol and fix the state machine with respect to signals and a regression found by the syzbot. The last patch is by me an missing assignment during the ethtool ring configuration callback. * tag 'linux-can-fixes-for-6.2-20230202' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: can: mcp251xfd: mcp251xfd_ring_set_ringparam(): assign missing tx_obj_num_coalesce_irq can: isotp: split tx timer into transmission and timeout can: isotp: handle wait_event_interruptible() return values can: raw: fix CAN FD frame transmissions over CAN XL devices can: j1939: fix errant WARN_ON_ONCE in j1939_session_deactivate ==================== Link: https://lore.kernel.org/r/20230202094135.2293939-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| | * can: mcp251xfd: mcp251xfd_ring_set_ringparam(): assign missing ↵Marc Kleine-Budde2023-02-021-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tx_obj_num_coalesce_irq If the a new ring layout is set, the max coalesced frames for RX and TX are re-calculated, too. Add the missing assignment of the newly calculated TX max coalesced frames. Fixes: 656fc12ddaf8 ("can: mcp251xfd: add TX IRQ coalescing ethtool support") Link: https://lore.kernel.org/all/20230130154334.1578518-1-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
| | * can: isotp: split tx timer into transmission and timeoutOliver Hartkopp2023-02-021-36/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The timer for the transmission of isotp PDUs formerly had two functions: 1. send two consecutive frames with a given time gap 2. monitor the timeouts for flow control frames and the echo frames This led to larger txstate checks and potentially to a problem discovered by syzbot which enabled the panic_on_warn feature while testing. The former 'txtimer' function is split into 'txfrtimer' and 'txtimer' to handle the two above functionalities with separate timer callbacks. The two simplified timers now run in one-shot mode and make the state transitions (especially with isotp_rcv_echo) better understandable. Fixes: 866337865f37 ("can: isotp: fix tx state handling for echo tx processing") Reported-by: syzbot+5aed6c3aaba661f5b917@syzkaller.appspotmail.com Cc: stable@vger.kernel.org # >= v6.0 Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://lore.kernel.org/all/20230104145701.2422-1-socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
| | * can: isotp: handle wait_event_interruptible() return valuesOliver Hartkopp2023-02-021-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When wait_event_interruptible() has been interrupted by a signal the tx.state value might not be ISOTP_IDLE. Force the state machines into idle state to inhibit the timer handlers to continue working. Fixes: 866337865f37 ("can: isotp: fix tx state handling for echo tx processing") Cc: stable@vger.kernel.org Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://lore.kernel.org/all/20230112192347.1944-1-socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
| | * can: raw: fix CAN FD frame transmissions over CAN XL devicesOliver Hartkopp2023-02-021-16/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A CAN XL device is always capable to process CAN FD frames. The former check when sending CAN FD frames relied on the existence of a CAN FD device and did not check for a CAN XL device that would be correct too. With this patch the CAN FD feature is enabled automatically when CAN XL is switched on - and CAN FD cannot be switch off while CAN XL is enabled. This precondition also leads to a clean up and reduction of checks in the hot path in raw_rcv() and raw_sendmsg(). Some conditions are reordered to handle simple checks first. changes since v1: https://lore.kernel.org/all/20230131091012.50553-1-socketcan@hartkopp.net - fixed typo: devive -> device changes since v2: https://lore.kernel.org/all/20230131091824.51026-1-socketcan@hartkopp.net/ - reorder checks in if statements to handle simple checks first Fixes: 626332696d75 ("can: raw: add CAN XL support") Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://lore.kernel.org/all/20230131105613.55228-1-socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
| | * can: j1939: fix errant WARN_ON_ONCE in j1939_session_deactivateZiyang Xuan2023-02-021-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The conclusion "j1939_session_deactivate() should be called with a session ref-count of at least 2" is incorrect. In some concurrent scenarios, j1939_session_deactivate can be called with the session ref-count less than 2. But there is not any problem because it will check the session active state before session putting in j1939_session_deactivate_locked(). Here is the concurrent scenario of the problem reported by syzbot and my reproduction log. cpu0 cpu1 j1939_xtp_rx_eoma j1939_xtp_rx_abort_one j1939_session_get_by_addr [kref == 2] j1939_session_get_by_addr [kref == 3] j1939_session_deactivate [kref == 2] j1939_session_put [kref == 1] j1939_session_completed j1939_session_deactivate WARN_ON_ONCE(kref < 2) ===================================================== WARNING: CPU: 1 PID: 21 at net/can/j1939/transport.c:1088 j1939_session_deactivate+0x5f/0x70 CPU: 1 PID: 21 Comm: ksoftirqd/1 Not tainted 5.14.0-rc7+ #32 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1 04/01/2014 RIP: 0010:j1939_session_deactivate+0x5f/0x70 Call Trace: j1939_session_deactivate_activate_next+0x11/0x28 j1939_xtp_rx_eoma+0x12a/0x180 j1939_tp_recv+0x4a2/0x510 j1939_can_recv+0x226/0x380 can_rcv_filter+0xf8/0x220 can_receive+0x102/0x220 ? process_backlog+0xf0/0x2c0 can_rcv+0x53/0xf0 __netif_receive_skb_one_core+0x67/0x90 ? process_backlog+0x97/0x2c0 __netif_receive_skb+0x22/0x80 Fixes: 0c71437dd50d ("can: j1939: j1939_session_deactivate(): clarify lifetime of session object") Reported-by: syzbot+9981a614060dcee6eeca@syzkaller.appspotmail.com Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://lore.kernel.org/all/20210906094200.95868-1-william.xuanziyang@huawei.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
| * | Merge branch 'maintainers-spring-refresh-of-networking-maintainers'Jakub Kicinski2023-02-023-3/+10
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jakub Kicinski says: ==================== MAINTAINERS: spring refresh of networking maintainers Use Jon Corbet's script for generating statistics about maintainer coverage to identify inactive maintainers of relatively active code. Move them to CREDITS. ==================== Link: https://lore.kernel.org/r/20230201182014.2362044-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| | * | MAINTAINERS: update SCTP maintainersJakub Kicinski2023-02-022-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Vlad has stepped away from SCTP related duties. Move him to CREDITS and add Xin Long. Subsystem SCTP PROTOCOL Changes 237 / 629 (37%) Last activity: 2022-12-12 Vlad Yasevich <vyasevich@gmail.com>: Neil Horman <nhorman@tuxdriver.com>: Author 20a785aa52c8 2020-05-19 00:00:00 4 Tags 20a785aa52c8 2020-05-19 00:00:00 84 Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>: Author 557fb5862c92 2021-07-28 00:00:00 41 Tags da05cecc4939 2022-12-12 00:00:00 197 Top reviewers: [15]: lucien.xin@gmail.com INACTIVE MAINTAINER Vlad Yasevich <vyasevich@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| | * | MAINTAINERS: ipv6: retire Hideaki YoshifujiJakub Kicinski2023-02-021-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We very rarely hear from Hideaki Yoshifuji and the IPv4/IPv6 entry covers a lot of code. Asking people to CC someone who rarely responds feels wrong. Note that Hideaki Yoshifuji already has an entry in CREDITS for IPv6 so not adding another one. Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| | * | mailmap: add John Crispin's entryJakub Kicinski2023-02-021-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | John has not been CCed on some of the fixes which perhaps resulted in the lack of review tags: Subsystem MEDIATEK ETHERNET DRIVER Changes 50 / 295 (16%) Last activity: 2023-01-17 Felix Fietkau <nbd@nbd.name>: Author 8bd8dcc5e47f 2022-11-18 00:00:00 33 Tags 8bd8dcc5e47f 2022-11-18 00:00:00 38 John Crispin <john@phrozen.org>: Sean Wang <sean.wang@mediatek.com>: Author 880c2d4b2fdf 2019-06-03 00:00:00 7 Tags a5d75538295b 2020-04-07 00:00:00 10 Mark Lee <Mark-MC.Lee@mediatek.com>: Author 8d66a8183d0c 2019-11-14 00:00:00 4 Tags 8d66a8183d0c 2019-11-14 00:00:00 4 Lorenzo Bianconi <lorenzo@kernel.org>: Author 08a764a7c51b 2023-01-17 00:00:00 68 Tags 08a764a7c51b 2023-01-17 00:00:00 74 Top reviewers: [12]: leonro@nvidia.com [6]: f.fainelli@gmail.com [6]: andrew@lunn.ch INACTIVE MAINTAINER John Crispin <john@phrozen.org> map his old address to the up to date one. Acked-by: John Crispin <john@phrozen.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| | * | MAINTAINERS: bonding: move Veaceslav Falico to CREDITSJakub Kicinski2023-02-022-1/+4
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Veaceslav has stepped away from netdev: Subsystem BONDING DRIVER Changes 96 / 319 (30%) Last activity: 2022-12-01 Jay Vosburgh <j.vosburgh@gmail.com>: Author 4f5d33f4f798 2022-08-11 00:00:00 3 Tags e5214f363dab 2022-12-01 00:00:00 48 Veaceslav Falico <vfalico@gmail.com>: Andy Gospodarek <andy@greyhouse.net>: Tags 47f706262f1d 2019-02-24 00:00:00 4 Top reviewers: [42]: jay.vosburgh@canonical.com [18]: jiri@nvidia.com [10]: jtoppins@redhat.com INACTIVE MAINTAINER Veaceslav Falico <vfalico@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | net: openvswitch: fix flow memory leak in ovs_flow_cmd_newFedor Pchelkin2023-02-021-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Syzkaller reports a memory leak of new_flow in ovs_flow_cmd_new() as it is not freed when an allocation of a key fails. BUG: memory leak unreferenced object 0xffff888116668000 (size 632): comm "syz-executor231", pid 1090, jiffies 4294844701 (age 18.871s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000defa3494>] kmem_cache_zalloc include/linux/slab.h:654 [inline] [<00000000defa3494>] ovs_flow_alloc+0x19/0x180 net/openvswitch/flow_table.c:77 [<00000000c67d8873>] ovs_flow_cmd_new+0x1de/0xd40 net/openvswitch/datapath.c:957 [<0000000010a539a8>] genl_family_rcv_msg_doit+0x22d/0x330 net/netlink/genetlink.c:739 [<00000000dff3302d>] genl_family_rcv_msg net/netlink/genetlink.c:783 [inline] [<00000000dff3302d>] genl_rcv_msg+0x328/0x590 net/netlink/genetlink.c:800 [<000000000286dd87>] netlink_rcv_skb+0x153/0x430 net/netlink/af_netlink.c:2515 [<0000000061fed410>] genl_rcv+0x24/0x40 net/netlink/genetlink.c:811 [<000000009dc0f111>] netlink_unicast_kernel net/netlink/af_netlink.c:1313 [inline] [<000000009dc0f111>] netlink_unicast+0x545/0x7f0 net/netlink/af_netlink.c:1339 [<000000004a5ee816>] netlink_sendmsg+0x8e7/0xde0 net/netlink/af_netlink.c:1934 [<00000000482b476f>] sock_sendmsg_nosec net/socket.c:651 [inline] [<00000000482b476f>] sock_sendmsg+0x152/0x190 net/socket.c:671 [<00000000698574ba>] ____sys_sendmsg+0x70a/0x870 net/socket.c:2356 [<00000000d28d9e11>] ___sys_sendmsg+0xf3/0x170 net/socket.c:2410 [<0000000083ba9120>] __sys_sendmsg+0xe5/0x1b0 net/socket.c:2439 [<00000000c00628f8>] do_syscall_64+0x30/0x40 arch/x86/entry/common.c:46 [<000000004abfdcf4>] entry_SYSCALL_64_after_hwframe+0x61/0xc6 To fix this the patch rearranges the goto labels to reflect the order of object allocations and adds appropriate goto statements on the error paths. Found by Linux Verification Center (linuxtesting.org) with Syzkaller. Fixes: 68bb10101e6b ("openvswitch: Fix flow lookup to use unmasked key") Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Acked-by: Eelco Chaudron <echaudro@redhat.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230201210218.361970-1-pchelkin@ispras.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | net: ethernet: mtk_eth_soc: disable hardware DSA untagging for second MACArınç ÜNAL2023-02-021-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | According to my tests on MT7621AT and MT7623NI SoCs, hardware DSA untagging won't work on the second MAC. Therefore, disable this feature when the second MAC of the MT7621 and MT7623 SoCs is being used. Fixes: 2d7605a72906 ("net: ethernet: mtk_eth_soc: enable hardware DSA untagging") Link: https://lore.kernel.org/netdev/6249fc14-b38a-c770-36b4-5af6d41c21d3@arinc9.com/ Tested-by: Arınç ÜNAL <arinc.unal@arinc9.com> Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com> Link: https://lore.kernel.org/r/20230128094232.2451947-1-arinc.unal@arinc9.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | virtio-net: Keep stop() to follow mirror sequence of open()Parav Pandit2023-02-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Cited commit in fixes tag frees rxq xdp info while RQ NAPI is still enabled and packet processing may be ongoing. Follow the mirror sequence of open() in the stop() callback. This ensures that when rxq info is unregistered, no rx packet processing is ongoing. Fixes: 754b8a21a96d ("virtio_net: setup xdp_rxq_info") Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Parav Pandit <parav@nvidia.com> Link: https://lore.kernel.org/r/20230202163516.12559-1-parav@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | selftests: net: udpgso_bench_tx: Cater for pending datagrams zerocopy ↵Andrei Gherzan2023-02-021-7/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | benchmarking The test tool can check that the zerocopy number of completions value is valid taking into consideration the number of datagram send calls. This can catch the system into a state where the datagrams are still in the system (for example in a qdisk, waiting for the network interface to return a completion notification, etc). This change adds a retry logic of computing the number of completions up to a configurable (via CLI) timeout (default: 2 seconds). Fixes: 79ebc3c26010 ("net/udpgso_bench_tx: options to exercise TX CMSG") Signed-off-by: Andrei Gherzan <andrei.gherzan@canonical.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20230201001612.515730-4-andrei.gherzan@canonical.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * | selftests: net: udpgso_bench: Fix racing bug between the rx/tx programsAndrei Gherzan2023-02-021-4/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "udpgro_bench.sh" invokes udpgso_bench_rx/udpgso_bench_tx programs subsequently and while doing so, there is a chance that the rx one is not ready to accept socket connections. This racing bug could fail the test with at least one of the following: ./udpgso_bench_tx: connect: Connection refused ./udpgso_bench_tx: sendmsg: Connection refused ./udpgso_bench_tx: write: Connection refused This change addresses this by making udpgro_bench.sh wait for the rx program to be ready before firing off the tx one - up to a 10s timeout. Fixes: 3a687bef148d ("selftests: udp gso benchmark") Signed-off-by: Andrei Gherzan <andrei.gherzan@canonical.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Willem de Bruijn <willemb@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20230201001612.515730-3-andrei.gherzan@canonical.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * | selftests: net: udpgso_bench_rx/tx: Stop when wrong CLI args are providedAndrei Gherzan2023-02-022-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Leaving unrecognized arguments buried in the output, can easily hide a CLI/script typo. Avoid this by exiting when wrong arguments are provided to the udpgso_bench test programs. Fixes: 3a687bef148d ("selftests: udp gso benchmark") Signed-off-by: Andrei Gherzan <andrei.gherzan@canonical.com> Cc: Willem de Bruijn <willemb@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20230201001612.515730-2-andrei.gherzan@canonical.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * | selftests: net: udpgso_bench_rx: Fix 'used uninitialized' compiler warningAndrei Gherzan2023-02-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This change fixes the following compiler warning: /usr/include/x86_64-linux-gnu/bits/error.h:40:5: warning: ‘gso_size’ may be used uninitialized [-Wmaybe-uninitialized] 40 | __error_noreturn (__status, __errnum, __format, __va_arg_pack ()); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ udpgso_bench_rx.c: In function ‘main’: udpgso_bench_rx.c:253:23: note: ‘gso_size’ was declared here 253 | int ret, len, gso_size, budget = 256; Fixes: 3327a9c46352 ("selftests: add functionals test for UDP GRO") Signed-off-by: Andrei Gherzan <andrei.gherzan@canonical.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20230201001612.515730-1-andrei.gherzan@canonical.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * | hv_netvsc: Fix missed pagebuf entries in netvsc_dma_map/unmap()Michael Kelley2023-02-021-7/+2
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | netvsc_dma_map() and netvsc_dma_unmap() currently check the cp_partial flag and adjust the page_count so that pagebuf entries for the RNDIS portion of the message are skipped when it has already been copied into a send buffer. But this adjustment has already been made by code in netvsc_send(). The duplicate adjustment causes some pagebuf entries to not be mapped. In a normal VM, this doesn't break anything because the mapping doesn’t change the PFN. But in a Confidential VM, dma_map_single() does bounce buffering and provides a different PFN. Failing to do the mapping causes the wrong PFN to be passed to Hyper-V, and various errors ensue. Fix this by removing the duplicate adjustment in netvsc_dma_map() and netvsc_dma_unmap(). Fixes: 846da38de0e8 ("net: netvsc: Add Isolation VM support for netvsc driver") Cc: stable@vger.kernel.org Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Link: https://lore.kernel.org/r/1675135986-254490-1-git-send-email-mikelley@microsoft.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * octeontx2-af: Fix devlink unregisterRatheesh Kannoth2023-02-011-8/+27
| | | | | | | | | | | | | | | | | | | | | | | | Exact match feature is only available in CN10K-B. Unregister exact match devlink entry only for this silicon variant. Fixes: 87e4ea29b030 ("octeontx2-af: Debugsfs support for exact match.") Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20230131061659.1025137-1-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * igc: return an error if the mac type is unknown in igc_ptp_systim_to_hwtstamp()Tom Rix2023-02-011-5/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | clang static analysis reports drivers/net/ethernet/intel/igc/igc_ptp.c:673:3: warning: The left operand of '+' is a garbage value [core.UndefinedBinaryOperatorResult] ktime_add_ns(shhwtstamps.hwtstamp, adjust); ^ ~~~~~~~~~~~~~~~~~~~~ igc_ptp_systim_to_hwtstamp() silently returns without setting the hwtstamp if the mac type is unknown. This should be treated as an error. Fixes: 81b055205e8b ("igc: Add support for RX timestamping") Signed-off-by: Tom Rix <trix@redhat.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Acked-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20230131215437.1528994-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * nfp: flower: avoid taking mutex in atomic contextYanguo Li2023-02-011-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | A mutex may sleep, which is not permitted in atomic context. Avoid a case where this may arise by moving the to nfp_flower_lag_get_info_from_netdev() in nfp_tun_write_neigh() spinlock. Fixes: abc210952af7 ("nfp: flower: tunnel neigh support bond offload") Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Yanguo Li <yanguo.li@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230131080313.2076060-1-simon.horman@corigine.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * Merge branch ↵Jakub Kicinski2023-02-011-27/+32
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'ip-ip6_gre-fix-gre-tunnels-not-generating-ipv6-link-local-addresses' Thomas Winter says: ==================== ip/ip6_gre: Fix GRE tunnels not generating IPv6 link local addresses For our point-to-point GRE tunnels, they have IN6_ADDR_GEN_MODE_NONE when they are created then we set IN6_ADDR_GEN_MODE_EUI64 when they come up to generate the IPv6 link local address for the interface. Recently we found that they were no longer generating IPv6 addresses. Also, non-point-to-point tunnels were not generating any IPv6 link local address and instead generating an IPv6 compat address, breaking IPv6 communication on the tunnel. These failures were caused by commit e5dd729460ca and this patch set aims to resolve these issues. ==================== Link: https://lore.kernel.org/r/20230131034646.237671-1-Thomas.Winter@alliedtelesis.co.nz Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| | * ip/ip6_gre: Fix non-point-to-point tunnel not generating IPv6 link local addressThomas Winter2023-02-011-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We recently found that our non-point-to-point tunnels were not generating any IPv6 link local address and instead generating an IPv6 compat address, breaking IPv6 communication on the tunnel. Previously, addrconf_gre_config always would call addrconf_addr_gen and generate a EUI64 link local address for the tunnel. Then commit e5dd729460ca changed the code path so that add_v4_addrs is called but this only generates a compat IPv6 address for non-point-to-point tunnels. I assume the compat address is specifically for SIT tunnels so have kept that only for SIT - GRE tunnels now always generate link local addresses. Fixes: e5dd729460ca ("ip/ip6_gre: use the same logic as SIT interfaces when computing v6LL address") Signed-off-by: Thomas Winter <Thomas.Winter@alliedtelesis.co.nz> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| | * ip/ip6_gre: Fix changing addr gen mode not generating IPv6 link local addressThomas Winter2023-02-011-22/+27
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For our point-to-point GRE tunnels, they have IN6_ADDR_GEN_MODE_NONE when they are created then we set IN6_ADDR_GEN_MODE_EUI64 when they come up to generate the IPv6 link local address for the interface. Recently we found that they were no longer generating IPv6 addresses. This issue would also have affected SIT tunnels. Commit e5dd729460ca changed the code path so that GRE tunnels generate an IPv6 address based on the tunnel source address. It also changed the code path so GRE tunnels don't call addrconf_addr_gen in addrconf_dev_config which is called by addrconf_sysctl_addr_gen_mode when the IN6_ADDR_GEN_MODE is changed. This patch aims to fix this issue by moving the code in addrconf_notify which calls the addr gen for GRE and SIT into a separate function and calling it in the places that expect the IPv6 address to be generated. The previous addrconf_dev_config is renamed to addrconf_eth_config since it only expected eth type interfaces and follows the addrconf_gre/sit_config format. A part of this changes means that the loopback address will be attempted to be configured when changing addr_gen_mode for lo. This should not be a problem because the address should exist anyway and if does already exist then no error is produced. Fixes: e5dd729460ca ("ip/ip6_gre: use the same logic as SIT interfaces when computing v6LL address") Signed-off-by: Thomas Winter <Thomas.Winter@alliedtelesis.co.nz> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nfJakub Kicinski2023-01-312-2/+4
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pablo Neira Ayuso says: ==================== Netfilter fixes for net 1) Release bridge info once packet escapes the br_netfilter path, from Florian Westphal. 2) Revert incorrect fix for the SCTP connection tracking chunk iterator, also from Florian. First path fixes a long standing issue, the second path addresses a mistake in the previous pull request for net. * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: Revert "netfilter: conntrack: fix bug in for_each_sctp_chunk" netfilter: br_netfilter: disable sabotage_in hook after first suppression ==================== Link: https://lore.kernel.org/r/20230131133158.4052-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| | * Revert "netfilter: conntrack: fix bug in for_each_sctp_chunk"Florian Westphal2023-01-311-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is no bug. If sch->length == 0, this would result in an infinite loop, but first caller, do_basic_checks(), errors out in this case. After this change, packets with bogus zero-length chunks are no longer detected as invalid, so revert & add comment wrt. 0 length check. Fixes: 98ee00774525 ("netfilter: conntrack: fix bug in for_each_sctp_chunk") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * netfilter: br_netfilter: disable sabotage_in hook after first suppressionFlorian Westphal2023-01-311-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When using a xfrm interface in a bridged setup (the outgoing device is bridged), the incoming packets in the xfrm interface are only tracked in the outgoing direction. $ brctl show bridge name interfaces br_eth1 eth1 $ conntrack -L tcp 115 SYN_SENT src=192... dst=192... [UNREPLIED] ... If br_netfilter is enabled, the first (encrypted) packet is received onR eth1, conntrack hooks are called from br_netfilter emulation which allocates nf_bridge info for this skb. If the packet is for local machine, skb gets passed up the ip stack. The skb passes through ip prerouting a second time. br_netfilter ip_sabotage_in supresses the re-invocation of the hooks. After this, skb gets decrypted in xfrm layer and appears in network stack a second time (after decryption). Then, ip_sabotage_in is called again and suppresses netfilter hook invocation, even though the bridge layer never called them for the plaintext incarnation of the packet. Free the bridge info after the first suppression to avoid this. I was unable to figure out where the regression comes from, as far as i can see br_netfilter always had this problem; i did not expect that skb is looped again with different headers. Fixes: c4b0e771f906 ("netfilter: avoid using skb->nf_bridge directly") Reported-and-tested-by: Wolfgang Nothdurft <wolfgang@linogate.de> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | net: phy: meson-gxl: Add generic dummy stubs for MMD register accessChris Healy2023-01-311-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Meson G12A Internal PHY does not support standard IEEE MMD extended register access, therefore add generic dummy stubs to fail the read and write MMD calls. This is necessary to prevent the core PHY code from erroneously believing that EEE is supported by this PHY even though this PHY does not support EEE, as MMD register access returns all FFFFs. Fixes: 5c3407abb338 ("net: phy: meson-gxl: add g12a support") Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Chris Healy <healych@amazon.com> Reviewed-by: Jerome Brunet <jbrunet@baylibre.com> Link: https://lore.kernel.org/r/20230130231402.471493-1-cphealy@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | net: fix NULL pointer in skb_segment_listYan Zhai2023-01-311-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 3a1296a38d0c ("net: Support GRO/GSO fraglist chaining.") introduced UDP listifyed GRO. The segmentation relies on frag_list being untouched when passing through the network stack. This assumption can be broken sometimes, where frag_list itself gets pulled into linear area, leaving frag_list being NULL. When this happens it can trigger following NULL pointer dereference, and panic the kernel. Reverse the test condition should fix it. [19185.577801][ C1] BUG: kernel NULL pointer dereference, address: ... [19185.663775][ C1] RIP: 0010:skb_segment_list+0x1cc/0x390 ... [19185.834644][ C1] Call Trace: [19185.841730][ C1] <TASK> [19185.848563][ C1] __udp_gso_segment+0x33e/0x510 [19185.857370][ C1] inet_gso_segment+0x15b/0x3e0 [19185.866059][ C1] skb_mac_gso_segment+0x97/0x110 [19185.874939][ C1] __skb_gso_segment+0xb2/0x160 [19185.883646][ C1] udp_queue_rcv_skb+0xc3/0x1d0 [19185.892319][ C1] udp_unicast_rcv_skb+0x75/0x90 [19185.900979][ C1] ip_protocol_deliver_rcu+0xd2/0x200 [19185.910003][ C1] ip_local_deliver_finish+0x44/0x60 [19185.918757][ C1] __netif_receive_skb_one_core+0x8b/0xa0 [19185.927834][ C1] process_backlog+0x88/0x130 [19185.935840][ C1] __napi_poll+0x27/0x150 [19185.943447][ C1] net_rx_action+0x27e/0x5f0 [19185.951331][ C1] ? mlx5_cq_tasklet_cb+0x70/0x160 [mlx5_core] [19185.960848][ C1] __do_softirq+0xbc/0x25d [19185.968607][ C1] irq_exit_rcu+0x83/0xb0 [19185.976247][ C1] common_interrupt+0x43/0xa0 [19185.984235][ C1] asm_common_interrupt+0x22/0x40 ... [19186.094106][ C1] </TASK> Fixes: 3a1296a38d0c ("net: Support GRO/GSO fraglist chaining.") Suggested-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Yan Zhai <yan@cloudflare.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/Y9gt5EUizK1UImEP@debian Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | net: fman: memac: free mdio device if lynx_pcs_create() failsVladimir Oltean2023-01-311-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When memory allocation fails in lynx_pcs_create() and it returns NULL, there remains a dangling reference to the mdiodev returned by of_mdio_find_device() which is leaked as soon as memac_pcs_create() returns empty-handed. Fixes: a7c2a32e7f22 ("net: fman: memac: Use lynx pcs driver") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Link: https://lore.kernel.org/r/20230130193051.563315-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | sctp: do not check hb_timer.expires when resetting hb_timerXin Long2023-01-311-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It tries to avoid the frequently hb_timer refresh in commit ba6f5e33bdbb ("sctp: avoid refreshing heartbeat timer too often"), and it only allows mod_timer when the new expires is after hb_timer.expires. It means even a much shorter interval for hb timer gets applied, it will have to wait until the current hb timer to time out. In sctp_do_8_2_transport_strike(), when a transport enters PF state, it expects to update the hb timer to resend a heartbeat every rto after calling sctp_transport_reset_hb_timer(), which will not work as the change mentioned above. The frequently hb_timer refresh was caused by sctp_transport_reset_timers() called in sctp_outq_flush() and it was already removed in the commit above. So we don't have to check hb_timer.expires when resetting hb_timer as it is now not called very often. Fixes: ba6f5e33bdbb ("sctp: avoid refreshing heartbeat timer too often") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Link: https://lore.kernel.org/r/d958c06985713ec84049a2d5664879802710179a.1675095933.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | net: sched: sch: Bounds check priorityKees Cook2023-01-311-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Nothing was explicitly bounds checking the priority index used to access clpriop[]. WARN and bail out early if it's pathological. Seen with GCC 13: ../net/sched/sch_htb.c: In function 'htb_activate_prios': ../net/sched/sch_htb.c:437:44: warning: array subscript [0, 31] is outside array bounds of 'struct htb_prio[8]' [-Warray-bounds=] 437 | if (p->inner.clprio[prio].feed.rb_node) | ~~~~~~~~~~~~~~~^~~~~~ ../net/sched/sch_htb.c:131:41: note: while referencing 'clprio' 131 | struct htb_prio clprio[TC_HTB_NUMPRIO]; | ^~~~~~ Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Jiri Pirko <jiri@resnulli.us> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Cong Wang <cong.wang@bytedance.com> Link: https://lore.kernel.org/r/20230127224036.never.561-kees@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * | net: ethernet: mtk_eth_soc: Avoid truncating allocationKees Cook2023-01-312-3/+1
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There doesn't appear to be a reason to truncate the allocation used for flow_info, so do a full allocation and remove the unused empty struct. GCC does not like having a reference to an object that has been partially allocated, as bounds checking may become impossible when such an object is passed to other code. Seen with GCC 13: ../drivers/net/ethernet/mediatek/mtk_ppe.c: In function 'mtk_foe_entry_commit_subflow': ../drivers/net/ethernet/mediatek/mtk_ppe.c:623:18: warning: array subscript 'struct mtk_flow_entry[0]' is partly outside array bounds of 'unsigned char[48]' [-Warray-bounds=] 623 | flow_info->l2_data.base_flow = entry; | ^~ Cc: Felix Fietkau <nbd@nbd.name> Cc: John Crispin <john@phrozen.org> Cc: Sean Wang <sean.wang@mediatek.com> Cc: Mark Lee <Mark-MC.Lee@mediatek.com> Cc: Lorenzo Bianconi <lorenzo@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: netdev@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-mediatek@lists.infradead.org Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230127223853.never.014-kees@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * Merge tag 'ieee802154-for-net-2023-01-30' of ↵Jakub Kicinski2023-01-301-1/+0
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan Stefan Schmidt says: ==================== ieee802154 for net 2023-01-30 Only one fix this time around. Miquel Raynal fixed a potential double free spotted by Dan Carpenter. * tag 'ieee802154-for-net-2023-01-30' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan: mac802154: Fix possible double free upon parsing error ==================== Link: https://lore.kernel.org/r/20230130095646.301448-1-stefan@datenfreihafen.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| | * mac802154: Fix possible double free upon parsing errorMiquel Raynal2022-12-191-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 4d1c7d87030b ("mac802154: Move an skb free within the rx path") tried to simplify error handling within the receive path by moving the kfree_skb() call at the very end of the top-level function but missed one kfree_skb() called upon frame parsing error. Prevent this possible double free from happening. Fixes: 4d1c7d87030b ("mac802154: Move an skb free within the rx path") Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/r/20221216235742.646134-1-miquel.raynal@bootlin.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
| * | net/tls: tls_is_tx_ready() checked list_entryPietro Borrello2023-01-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tls_is_tx_ready() checks that list_first_entry() does not return NULL. This condition can never happen. For empty lists, list_first_entry() returns the list_entry() of the head, which is a type confusion. Use list_first_entry_or_null() which returns NULL in case of empty lists. Fixes: a42055e8d2c3 ("net/tls: Add support for async encryption of records for performance") Signed-off-by: Pietro Borrello <borrello@diag.uniroma1.it> Link: https://lore.kernel.org/r/20230128-list-entry-null-check-tls-v1-1-525bbfe6f0d0@diag.uniroma1.it Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | Merge branch '100GbE' of ↵Jakub Kicinski2023-01-306-20/+44
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2023-01-27 (ice) This series contains updates to ice driver only. Dave prevents modifying channels when RDMA is active as this will break RDMA traffic. Michal fixes a broken URL. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: ice: Fix broken link in ice NAPI doc ice: Prevent set_channel from changing queues while RDMA active ==================== Link: https://lore.kernel.org/r/20230127225333.1534783-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| | * | ice: Fix broken link in ice NAPI docMichal Wilczynski2023-01-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current link for NAPI documentation in ice driver doesn't work - it returns 404. Update the link to the working one. Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com> Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| | * | ice: Prevent set_channel from changing queues while RDMA activeDave Ertman2023-01-275-19/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PF controls the set of queues that the RDMA auxiliary_driver requests resources from. The set_channel command will alter that pool and trigger a reconfiguration of the VSI, which breaks RDMA functionality. Prevent set_channel from executing when RDMA driver bound to auxiliary device. Adding a locked variable to pass down the call chain to avoid double locking the device_lock. Fixes: 348048e724a0 ("ice: Implement iidc operations") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * | | net: phy: fix null dereference in phy_attach_directColin Foster2023-01-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit bc66fa87d4fd ("net: phy: Add link between phy dev and mac dev") introduced a link between net devices and phy devices. It fails to check whether dev is NULL, leading to a NULL dereference error. Fixes: bc66fa87d4fd ("net: phy: Add link between phy dev and mac dev") Signed-off-by: Colin Foster <colin.foster@in-advantage.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | netrom: Fix use-after-free caused by accept on already connected socketHyunwoo Kim2023-01-301-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If you call listen() and accept() on an already connect()ed AF_NETROM socket, accept() can successfully connect. This is because when the peer socket sends data to sendmsg, the skb with its own sk stored in the connected socket's sk->sk_receive_queue is connected, and nr_accept() dequeues the skb waiting in the sk->sk_receive_queue. As a result, nr_accept() allocates and returns a sock with the sk of the parent AF_NETROM socket. And here use-after-free can happen through complex race conditions: ``` cpu0 cpu1 1. socket_2 = socket(AF_NETROM) . . listen(socket_2) accepted_socket = accept(socket_2) 2. socket_1 = socket(AF_NETROM) nr_create() // sk refcount : 1 connect(socket_1) 3. write(accepted_socket) nr_sendmsg() nr_output() nr_kick() nr_send_iframe() nr_transmit_buffer() nr_route_frame() nr_loopback_queue() nr_loopback_timer() nr_rx_frame() nr_process_rx_frame(sk, skb); // sk : socket_1's sk nr_state3_machine() nr_queue_rx_frame() sock_queue_rcv_skb() sock_queue_rcv_skb_reason() __sock_queue_rcv_skb() __skb_queue_tail(list, skb); // list : socket_1's sk->sk_receive_queue 4. listen(socket_1) nr_listen() uaf_socket = accept(socket_1) nr_accept() skb_dequeue(&sk->sk_receive_queue); 5. close(accepted_socket) nr_release() nr_write_internal(sk, NR_DISCREQ) nr_transmit_buffer() // NR_DISCREQ nr_route_frame() nr_loopback_queue() nr_loopback_timer() nr_rx_frame() // sk : socket_1's sk nr_process_rx_frame() // NR_STATE_3 nr_state3_machine() // NR_DISCREQ nr_disconnect() nr_sk(sk)->state = NR_STATE_0; 6. close(socket_1) // sk refcount : 3 nr_release() // NR_STATE_0 sock_put(sk); // sk refcount : 0 sk_free(sk); close(uaf_socket) nr_release() sock_hold(sk); // UAF ``` KASAN report by syzbot: ``` BUG: KASAN: use-after-free in nr_release+0x66/0x460 net/netrom/af_netrom.c:520 Write of size 4 at addr ffff8880235d8080 by task syz-executor564/5128 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xd1/0x138 lib/dump_stack.c:106 print_address_description mm/kasan/report.c:306 [inline] print_report+0x15e/0x461 mm/kasan/report.c:417 kasan_report+0xbf/0x1f0 mm/kasan/report.c:517 check_region_inline mm/kasan/generic.c:183 [inline] kasan_check_range+0x141/0x190 mm/kasan/generic.c:189 instrument_atomic_read_write include/linux/instrumented.h:102 [inline] atomic_fetch_add_relaxed include/linux/atomic/atomic-instrumented.h:116 [inline] __refcount_add include/linux/refcount.h:193 [inline] __refcount_inc include/linux/refcount.h:250 [inline] refcount_inc include/linux/refcount.h:267 [inline] sock_hold include/net/sock.h:775 [inline] nr_release+0x66/0x460 net/netrom/af_netrom.c:520 __sock_release+0xcd/0x280 net/socket.c:650 sock_close+0x1c/0x20 net/socket.c:1365 __fput+0x27c/0xa90 fs/file_table.c:320 task_work_run+0x16f/0x270 kernel/task_work.c:179 exit_task_work include/linux/task_work.h:38 [inline] do_exit+0xaa8/0x2950 kernel/exit.c:867 do_group_exit+0xd4/0x2a0 kernel/exit.c:1012 get_signal+0x21c3/0x2450 kernel/signal.c:2859 arch_do_signal_or_restart+0x79/0x5c0 arch/x86/kernel/signal.c:306 exit_to_user_mode_loop kernel/entry/common.c:168 [inline] exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203 __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline] syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296 do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f6c19e3c9b9 Code: Unable to access opcode bytes at 0x7f6c19e3c98f. RSP: 002b:00007fffd4ba2ce8 EFLAGS: 00000246 ORIG_RAX: 0000000000000133 RAX: 0000000000000116 RBX: 0000000000000003 RCX: 00007f6c19e3c9b9 RDX: 0000000000000318 RSI: 00000000200bd000 RDI: 0000000000000006 RBP: 0000000000000003 R08: 000000000000000d R09: 000000000000000d R10: 0000000000000000 R11: 0000000000000246 R12: 000055555566a2c0 R13: 0000000000000011 R14: 0000000000000000 R15: 0000000000000000 </TASK> Allocated by task 5128: kasan_save_stack+0x22/0x40 mm/kasan/common.c:45 kasan_set_track+0x25/0x30 mm/kasan/common.c:52 ____kasan_kmalloc mm/kasan/common.c:371 [inline] ____kasan_kmalloc mm/kasan/common.c:330 [inline] __kasan_kmalloc+0xa3/0xb0 mm/kasan/common.c:380 kasan_kmalloc include/linux/kasan.h:211 [inline] __do_kmalloc_node mm/slab_common.c:968 [inline] __kmalloc+0x5a/0xd0 mm/slab_common.c:981 kmalloc include/linux/slab.h:584 [inline] sk_prot_alloc+0x140/0x290 net/core/sock.c:2038 sk_alloc+0x3a/0x7a0 net/core/sock.c:2091 nr_create+0xb6/0x5f0 net/netrom/af_netrom.c:433 __sock_create+0x359/0x790 net/socket.c:1515 sock_create net/socket.c:1566 [inline] __sys_socket_create net/socket.c:1603 [inline] __sys_socket_create net/socket.c:1588 [inline] __sys_socket+0x133/0x250 net/socket.c:1636 __do_sys_socket net/socket.c:1649 [inline] __se_sys_socket net/socket.c:1647 [inline] __x64_sys_socket+0x73/0xb0 net/socket.c:1647 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Freed by task 5128: kasan_save_stack+0x22/0x40 mm/kasan/common.c:45 kasan_set_track+0x25/0x30 mm/kasan/common.c:52 kasan_save_free_info+0x2b/0x40 mm/kasan/generic.c:518 ____kasan_slab_free mm/kasan/common.c:236 [inline] ____kasan_slab_free+0x13b/0x1a0 mm/kasan/common.c:200 kasan_slab_free include/linux/kasan.h:177 [inline] __cache_free mm/slab.c:3394 [inline] __do_kmem_cache_free mm/slab.c:3580 [inline] __kmem_cache_free+0xcd/0x3b0 mm/slab.c:3587 sk_prot_free net/core/sock.c:2074 [inline] __sk_destruct+0x5df/0x750 net/core/sock.c:2166 sk_destruct net/core/sock.c:2181 [inline] __sk_free+0x175/0x460 net/core/sock.c:2192 sk_free+0x7c/0xa0 net/core/sock.c:2203 sock_put include/net/sock.h:1991 [inline] nr_release+0x39e/0x460 net/netrom/af_netrom.c:554 __sock_release+0xcd/0x280 net/socket.c:650 sock_close+0x1c/0x20 net/socket.c:1365 __fput+0x27c/0xa90 fs/file_table.c:320 task_work_run+0x16f/0x270 kernel/task_work.c:179 exit_task_work include/linux/task_work.h:38 [inline] do_exit+0xaa8/0x2950 kernel/exit.c:867 do_group_exit+0xd4/0x2a0 kernel/exit.c:1012 get_signal+0x21c3/0x2450 kernel/signal.c:2859 arch_do_signal_or_restart+0x79/0x5c0 arch/x86/kernel/signal.c:306 exit_to_user_mode_loop kernel/entry/common.c:168 [inline] exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203 __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline] syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296 do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86 entry_SYSCALL_64_after_hwframe+0x63/0xcd ``` To fix this issue, nr_listen() returns -EINVAL for sockets that successfully nr_connect(). Reported-by: syzbot+caa188bdfc1eeafeb418@syzkaller.appspotmail.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Hyunwoo Kim <v4bel@theori.io> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: stmmac: do not stop RX_CLK in Rx LPI state for qcs404 SoCAndrey Konovalov2023-01-303-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently in phy_init_eee() the driver unconditionally configures the PHY to stop RX_CLK after entering Rx LPI state. This causes an LPI interrupt storm on my qcs404-base board. Change the PHY initialization so that for "qcom,qcs404-ethqos" compatible device RX_CLK continues to run even in Rx LPI state. Signed-off-by: Andrey Konovalov <andrey.konovalov@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | selftest: net: Improve IPV6_TCLASS/IPV6_HOPLIMIT tests apparmor compatibilityAndrei Gherzan2023-01-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "tcpdump" is used to capture traffic in these tests while using a random, temporary and not suffixed file for it. This can interfere with apparmor configuration where the tool is only allowed to read from files with 'known' extensions. The MINE type application/vnd.tcpdump.pcap was registered with IANA for pcap files and .pcap is the extension that is both most common but also aligned with standard apparmor configurations. See TCPDUMP(8) for more details. This improves compatibility with standard apparmor configurations by using ".pcap" as the file extension for the tests' temporary files. Signed-off-by: Andrei Gherzan <andrei.gherzan@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | Merge branch 't7xx-pm-fixes'David S. Miller2023-01-284-11/+47
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Kornel Dulęba says: ==================== net: wwan: t7xx: Fix Runtime PM implementation d10b3a695ba0 ("net: wwan: t7xx: Runtime PM") introduced support for Runtime PM for this driver, but due to a bug in the initialization logic the usage refcount would never reach 0, leaving the feature unused. This patchset addresses that, together with a bug found after runtime suspend was enabled. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>