summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
...
* | ipvlan: Make skb->skb_iif track skb->dev for l3s modeJianguo Wu2023-03-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For l3s mode, skb->dev is set to ipvlan interface in ipvlan_nf_input(): skb->dev = addr->master->dev but, skb->skb_iif remain unchanged, this will cause socket lookup failed if a target socket is bound to a interface, like the following example: ip link add ipvlan0 link eth0 type ipvlan mode l3s ip addr add dev ipvlan0 192.168.124.111/24 ip link set ipvlan0 up ping -c 1 -I ipvlan0 8.8.8.8 100% packet loss This is because there is no match sk in __raw_v4_lookup() as sk->sk_bound_dev_if != dif(skb->skb_iif). Fix this by make skb->skb_iif track skb->dev in ipvlan_nf_input(). Fixes: c675e06a98a4 ("ipvlan: decouple l3s mode dependencies from other modes") Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/29865b1f-6db7-c07a-de89-949d3721ea30@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | net: phy: nxp-c45-tja11xx: fix MII_BASIC_CONFIG_REV bitRadu Pirea (OSS)2023-03-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | According to the TJA1103 user manual, the bit for the reversed role in MII or RMII modes is bit 4. Cc: <stable@vger.kernel.org> # 5.15+ Fixes: b050f2f15e04 ("phy: nxp-c45: add driver for tja1103") Signed-off-by: Radu Pirea (OSS) <radu-nicolae.pirea@oss.nxp.com> Link: https://lore.kernel.org/r/20230309100111.1246214-1-radu-nicolae.pirea@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | Merge tag 'wireless-2023-03-10' of ↵Jakub Kicinski2023-03-102-21/+26
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Johannes Berg says: ==================== Just a few fixes: * MLO connection socket ownership didn't work * basic rates validation was missing (reported by by a private syzbot instances) * puncturing bitmap netlink policy was completely broken * properly check chandef for NULL channel, it can be pointing to a chandef that's still uninitialized * tag 'wireless-2023-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: wifi: cfg80211: fix MLO connection ownership wifi: mac80211: check basic rates validity wifi: nl80211: fix puncturing bitmap policy wifi: nl80211: fix NULL-ptr deref in offchan check ==================== Link: https://lore.kernel.org/r/20230310114647.35422-1-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | wifi: cfg80211: fix MLO connection ownershipJohannes Berg2023-03-101-9/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When disconnecting from an MLO connection we need the AP MLD address, not an arbitrary BSSID. Fix the code to do that. Fixes: 9ecff10e82a5 ("wifi: nl80211: refactor BSS lookup in nl80211_associate()") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230301115906.4c1b3b18980e.I008f070c7f3b8e8bde9278101ef9e40706a82902@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| * | wifi: mac80211: check basic rates validityJohannes Berg2023-03-101-10/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When userspace sets basic rates, it might send us some rates list that's empty or consists of invalid values only. We're currently ignoring invalid values and then may end up with a rates bitmap that's empty, which later results in a warning. Reject the call if there were no valid rates. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| * | wifi: nl80211: fix puncturing bitmap policyJohannes Berg2023-03-101-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This was meant to be a u32, and while applying the patch I tried to use policy validation for it. However, not only did I copy/paste it to u8 instead of u32, but also used the policy range erroneously. Fix both of these issues. Fixes: d7c1a9a0ed18 ("wifi: nl80211: validate and configure puncturing bitmap") Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| * | wifi: nl80211: fix NULL-ptr deref in offchan checkJohannes Berg2023-03-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If, e.g. in AP mode, the link was already created by userspace but not activated yet, it has a chandef but the chandef isn't valid and has no channel. Check for this and ignore this link. Fixes: 7b0a0e3c3a88 ("wifi: cfg80211: do some rework towards MLO link APIs") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230301115906.71bd4803fbb9.Iee39c0f6c2d3a59a8227674dc55d52e38b1090cf@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
* | | MAINTAINERS: make my email address consistentJiri Pirko2023-03-102-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | Use jiri@resnulli.us in all MAINTAINERS entries and fixup .mailmap so all other addresses point to that one. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Link: https://lore.kernel.org/r/20230309114911.923460-1-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | | Merge branch 'add-checking-sq-is-full-inside-xdp-xmit'Jakub Kicinski2023-03-101-70/+85
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Xuan Zhuo says: ==================== add checking sq is full inside xdp xmit If the queue of xdp xmit is not an independent queue, then when the xdp xmit used all the desc, the xmit from the __dev_queue_xmit() may encounter the following error. net ens4: Unexpected TXQ (0) queue failure: -28 This patch adds a check whether sq is full in XDP Xmit. ==================== Link: https://lore.kernel.org/r/20230308024935.91686-1-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | | virtio_net: add checking sq is full inside xdp xmitXuan Zhuo2023-03-101-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the queue of xdp xmit is not an independent queue, then when the xdp xmit used all the desc, the xmit from the __dev_queue_xmit() may encounter the following error. net ens4: Unexpected TXQ (0) queue failure: -28 This patch adds a check whether sq is full in xdp xmit. Fixes: 56434a01b12e ("virtio_net: add XDP_TX support") Reported-by: Yichun Zhang <yichun@openresty.com> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | | virtio_net: separate the logic of checking whether sq is fullXuan Zhuo2023-03-101-24/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Separate the logic of checking whether sq is full. The subsequent patch will reuse this func. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | | virtio_net: reorder some funcsXuan Zhuo2023-03-101-46/+46
|/ / / | | | | | | | | | | | | | | | | | | | | | | | | The purpose of this is to facilitate the subsequent addition of new functions without introducing a separate declaration. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | | nfc: pn533: initialize struct pn533_out_arg properlyFedor Pchelkin2023-03-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | struct pn533_out_arg used as a temporary context for out_urb is not initialized properly. Its uninitialized 'phy' field can be dereferenced in error cases inside pn533_out_complete() callback function. It causes the following failure: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.2.0-rc3-next-20230110-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022 RIP: 0010:pn533_out_complete.cold+0x15/0x44 drivers/nfc/pn533/usb.c:441 Call Trace: <IRQ> __usb_hcd_giveback_urb+0x2b6/0x5c0 drivers/usb/core/hcd.c:1671 usb_hcd_giveback_urb+0x384/0x430 drivers/usb/core/hcd.c:1754 dummy_timer+0x1203/0x32d0 drivers/usb/gadget/udc/dummy_hcd.c:1988 call_timer_fn+0x1da/0x800 kernel/time/timer.c:1700 expire_timers+0x234/0x330 kernel/time/timer.c:1751 __run_timers kernel/time/timer.c:2022 [inline] __run_timers kernel/time/timer.c:1995 [inline] run_timer_softirq+0x326/0x910 kernel/time/timer.c:2035 __do_softirq+0x1fb/0xaf6 kernel/softirq.c:571 invoke_softirq kernel/softirq.c:445 [inline] __irq_exit_rcu+0x123/0x180 kernel/softirq.c:650 irq_exit_rcu+0x9/0x20 kernel/softirq.c:662 sysvec_apic_timer_interrupt+0x97/0xc0 arch/x86/kernel/apic/apic.c:1107 Initialize the field with the pn533_usb_phy currently used. Found by Linux Verification Center (linuxtesting.org) with Syzkaller. Fixes: 9dab880d675b ("nfc: pn533: Wait for out_urb's completion in pn533_usb_send_frame()") Reported-by: syzbot+1e608ba4217c96d1952f@syzkaller.appspotmail.com Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230309165050.207390-1-pchelkin@ispras.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | | tcp: tcp_make_synack() can be called from process contextBreno Leitao2023-03-091-1/+1
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tcp_rtx_synack() now could be called in process context as explained in 0a375c822497 ("tcp: tcp_rtx_synack() can be called from process context"). tcp_rtx_synack() might call tcp_make_synack(), which will touch per-CPU variables with preemption enabled. This causes the following BUG: BUG: using __this_cpu_add() in preemptible [00000000] code: ThriftIO1/5464 caller is tcp_make_synack+0x841/0xac0 Call Trace: <TASK> dump_stack_lvl+0x10d/0x1a0 check_preemption_disabled+0x104/0x110 tcp_make_synack+0x841/0xac0 tcp_v6_send_synack+0x5c/0x450 tcp_rtx_synack+0xeb/0x1f0 inet_rtx_syn_ack+0x34/0x60 tcp_check_req+0x3af/0x9e0 tcp_rcv_state_process+0x59b/0x2030 tcp_v6_do_rcv+0x5f5/0x700 release_sock+0x3a/0xf0 tcp_sendmsg+0x33/0x40 ____sys_sendmsg+0x2f2/0x490 __sys_sendmsg+0x184/0x230 do_syscall_64+0x3d/0x90 Avoid calling __TCP_INC_STATS() with will touch per-cpu variables. Use TCP_INC_STATS() which is safe to be called from context switch. Fixes: 8336886f786f ("tcp: TCP Fast Open Server - support TFO listeners") Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230308190745.780221-1-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | Merge tag 'net-6.3-rc2' of ↵Linus Torvalds2023-03-0992-356/+2599
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from netfilter and bpf. Current release - regressions: - core: avoid skb end_offset change in __skb_unclone_keeptruesize() - sched: - act_connmark: handle errno on tcf_idr_check_alloc - flower: fix fl_change() error recovery path - ieee802154: prevent user from crashing the host Current release - new code bugs: - eth: bnxt_en: fix the double free during device removal - tools: ynl: - fix enum-as-flags in the generic CLI - fully inherit attrs in subsets - re-license uniformly under GPL-2.0 or BSD-3-clause Previous releases - regressions: - core: use indirect calls helpers for sk_exit_memory_pressure() - tls: - fix return value for async crypto - avoid hanging tasks on the tx_lock - eth: ice: copy last block omitted in ice_get_module_eeprom() Previous releases - always broken: - core: avoid double iput when sock_alloc_file fails - af_unix: fix struct pid leaks in OOB support - tls: - fix possible race condition - fix device-offloaded sendpage straddling records - bpf: - sockmap: fix an infinite loop error - test_run: fix &xdp_frame misplacement for LIVE_FRAMES - fix resolving BTF_KIND_VAR after ARRAY, STRUCT, UNION, PTR - netfilter: tproxy: fix deadlock due to missing BH disable - phylib: get rid of unnecessary locking - eth: bgmac: fix *initial* chip reset to support BCM5358 - eth: nfp: fix csum for ipsec offload - eth: mtk_eth_soc: fix RX data corruption issue Misc: - usb: qmi_wwan: add telit 0x1080 composition" * tag 'net-6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (64 commits) tools: ynl: fix enum-as-flags in the generic CLI tools: ynl: move the enum classes to shared code net: avoid double iput when sock_alloc_file fails af_unix: fix struct pid leaks in OOB support eth: fealnx: bring back this old driver net: dsa: mt7530: permit port 5 to work without port 6 on MT7621 SoC net: microchip: sparx5: fix deletion of existing DSCP mappings octeontx2-af: Unlock contexts in the queue context cache in case of fault detection net/smc: fix fallback failed while sendmsg with fastopen ynl: re-license uniformly under GPL-2.0 OR BSD-3-Clause mailmap: update entries for Stephen Hemminger mailmap: add entry for Maxim Mikityanskiy nfc: change order inside nfc_se_io error path ethernet: ice: avoid gcc-9 integer overflow warning ice: don't ignore return codes in VSI related code ice: Fix DSCP PFC TLV creation net: usb: qmi_wwan: add Telit 0x1080 composition net: usb: cdc_mbim: avoid altsetting toggling for Telit FE990 netfilter: conntrack: adopt safer max chain length net: tls: fix device-offloaded sendpage straddling records ...
| * \ Merge branch '100GbE' of ↵Paolo Abeni2023-03-093-12/+15
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2023-03-07 (ice) This series contains updates to ice driver only. Dave removes masking from pfcena field as it was incorrectly preventing valid traffic classes from being enabled. Michal resolves various smatch issues such as not propagating error codes and returning 0 explicitly. Arnd Bergmann resolves gcc-9 warning for integer overflow. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: ethernet: ice: avoid gcc-9 integer overflow warning ice: don't ignore return codes in VSI related code ice: Fix DSCP PFC TLV creation ==================== Link: https://lore.kernel.org/r/20230307220714.3997294-1-anthony.l.nguyen@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| | * | ethernet: ice: avoid gcc-9 integer overflow warningArnd Bergmann2023-03-071-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With older compilers like gcc-9, the calculation of the vlan priority field causes a false-positive warning from the byteswap: In file included from drivers/net/ethernet/intel/ice/ice_tc_lib.c:4: drivers/net/ethernet/intel/ice/ice_tc_lib.c: In function 'ice_parse_cls_flower': include/uapi/linux/swab.h:15:15: error: integer overflow in expression '(int)(short unsigned int)((int)match.key-><U67c8>.<U6698>.vlan_priority << 13) & 57344 & 255' of type 'int' results in '0' [-Werror=overflow] 15 | (((__u16)(x) & (__u16)0x00ffU) << 8) | \ | ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~ include/uapi/linux/swab.h:106:2: note: in expansion of macro '___constant_swab16' 106 | ___constant_swab16(x) : \ | ^~~~~~~~~~~~~~~~~~ include/uapi/linux/byteorder/little_endian.h:42:43: note: in expansion of macro '__swab16' 42 | #define __cpu_to_be16(x) ((__force __be16)__swab16((x))) | ^~~~~~~~ include/linux/byteorder/generic.h:96:21: note: in expansion of macro '__cpu_to_be16' 96 | #define cpu_to_be16 __cpu_to_be16 | ^~~~~~~~~~~~~ drivers/net/ethernet/intel/ice/ice_tc_lib.c:1458:5: note: in expansion of macro 'cpu_to_be16' 1458 | cpu_to_be16((match.key->vlan_priority << | ^~~~~~~~~~~ After a change to be16_encode_bits(), the code becomes more readable to both people and compilers, which avoids the warning. Fixes: 34800178b302 ("ice: Add support for VLAN priority filters in switchdev") Suggested-by: Alexander Lobakin <alexandr.lobakin@intel.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| | * | ice: don't ignore return codes in VSI related codeMichal Swiatkowski2023-03-071-7/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There were few smatch warnings reported by Dan: - ice_vsi_cfg_xdp_txqs can return 0 instead of ret, which is cleaner - return values in ice_vsi_cfg_def were ignored - in ice_vsi_rebuild return value was ignored in case rebuild failed, it was a never reached code, however, rewrite it for clarity. - ice_vsi_cfg_tc can return 0 instead of ret Fixes: 6624e780a577 ("ice: split ice_vsi_setup into smaller functions") Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| | * | ice: Fix DSCP PFC TLV creationDave Ertman2023-03-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When creating the TLV to send to the FW for configuring DSCP mode PFC,the PFCENABLE field was being masked with a 4 bit mask (0xF), but this is an 8 bit bitmask for enabled classes for PFC. This means that traffic classes 4-7 could not be enabled for PFC. Remove the mask completely, as it is not necessary, as we are assigning 8 bits to an 8 bit field. Fixes: 2a87bd73e50d ("ice: Add DSCP support") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Karen Ostrowska <karen.ostrowska@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * | | Merge branch 'tools-ynl-fix-enum-as-flags-in-the-generic-cli'Jakub Kicinski2023-03-084-96/+126
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jakub Kicinski says: ==================== tools: ynl: fix enum-as-flags in the generic CLI The CLI needs to use proper classes when looking at Enum definitions rather than interpreting the YAML spec ad-hoc, because we have more than on format of the definition supported. ==================== Link: https://lore.kernel.org/r/20230308003923.445268-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| | * | | tools: ynl: fix enum-as-flags in the generic CLIJakub Kicinski2023-03-082-9/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Lorenzo points out that the generic CLI is broken for the netdev family. When I added the support for documentation of enums (and sparse enums) the client script was not updated. It expects the values in enum to be a list of names, now it can also be a dict (YAML object). Reported-by: Lorenzo Bianconi <lorenzo@kernel.org> Fixes: e4b48ed460d3 ("tools: ynl: add a completely generic client") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| | * | | tools: ynl: move the enum classes to shared codeJakub Kicinski2023-03-083-89/+121
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move bulk of the EnumSet and EnumEntry code to shared code for reuse by cli. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | | net: avoid double iput when sock_alloc_file failsThadeu Lima de Souza Cascardo2023-03-081-7/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When sock_alloc_file fails to allocate a file, it will call sock_release. __sys_socket_file should then not call sock_release again, otherwise there will be a double free. [ 89.319884] ------------[ cut here ]------------ [ 89.320286] kernel BUG at fs/inode.c:1764! [ 89.320656] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI [ 89.321051] CPU: 7 PID: 125 Comm: iou-sqp-124 Not tainted 6.2.0+ #361 [ 89.321535] RIP: 0010:iput+0x1ff/0x240 [ 89.321808] Code: d1 83 e1 03 48 83 f9 02 75 09 48 81 fa 00 10 00 00 77 05 83 e2 01 75 1f 4c 89 ef e8 fb d2 ba 00 e9 80 fe ff ff c3 cc cc cc cc <0f> 0b 0f 0b e9 d0 fe ff ff 0f 0b eb 8d 49 8d b4 24 08 01 00 00 48 [ 89.322760] RSP: 0018:ffffbdd60068bd50 EFLAGS: 00010202 [ 89.323036] RAX: 0000000000000000 RBX: ffff9d7ad3cacac0 RCX: 0000000000001107 [ 89.323412] RDX: 000000000003af00 RSI: 0000000000000000 RDI: ffff9d7ad3cacb40 [ 89.323785] RBP: ffffbdd60068bd68 R08: ffffffffffffffff R09: ffffffffab606438 [ 89.324157] R10: ffffffffacb3dfa0 R11: 6465686361657256 R12: ffff9d7ad3cacb40 [ 89.324529] R13: 0000000080000001 R14: 0000000080000001 R15: 0000000000000002 [ 89.324904] FS: 00007f7b28516740(0000) GS:ffff9d7aeb1c0000(0000) knlGS:0000000000000000 [ 89.325328] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 89.325629] CR2: 00007f0af52e96c0 CR3: 0000000002a02006 CR4: 0000000000770ee0 [ 89.326004] PKRU: 55555554 [ 89.326161] Call Trace: [ 89.326298] <TASK> [ 89.326419] __sock_release+0xb5/0xc0 [ 89.326632] __sys_socket_file+0xb2/0xd0 [ 89.326844] io_socket+0x88/0x100 [ 89.327039] ? io_issue_sqe+0x6a/0x430 [ 89.327258] io_issue_sqe+0x67/0x430 [ 89.327450] io_submit_sqes+0x1fe/0x670 [ 89.327661] io_sq_thread+0x2e6/0x530 [ 89.327859] ? __pfx_autoremove_wake_function+0x10/0x10 [ 89.328145] ? __pfx_io_sq_thread+0x10/0x10 [ 89.328367] ret_from_fork+0x29/0x50 [ 89.328576] RIP: 0033:0x0 [ 89.328732] Code: Unable to access opcode bytes at 0xffffffffffffffd6. [ 89.329073] RSP: 002b:0000000000000000 EFLAGS: 00000202 ORIG_RAX: 00000000000001a9 [ 89.329477] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007f7b28637a3d [ 89.329845] RDX: 00007fff4e4318a8 RSI: 00007fff4e4318b0 RDI: 0000000000000400 [ 89.330216] RBP: 00007fff4e431830 R08: 00007fff4e431711 R09: 00007fff4e4318b0 [ 89.330584] R10: 0000000000000000 R11: 0000000000000202 R12: 00007fff4e441b38 [ 89.330950] R13: 0000563835e3e725 R14: 0000563835e40d10 R15: 00007f7b28784040 [ 89.331318] </TASK> [ 89.331441] Modules linked in: [ 89.331617] ---[ end trace 0000000000000000 ]--- Fixes: da214a475f8b ("net: add __sys_socket_file()") Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20230307173707.468744-1-cascardo@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | | af_unix: fix struct pid leaks in OOB supportEric Dumazet2023-03-081-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | syzbot reported struct pid leak [1]. Issue is that queue_oob() calls maybe_add_creds() which potentially holds a reference on a pid. But skb->destructor is not set (either directly or by calling unix_scm_to_skb()) This means that subsequent kfree_skb() or consume_skb() would leak this reference. In this fix, I chose to fully support scm even for the OOB message. [1] BUG: memory leak unreferenced object 0xffff8881053e7f80 (size 128): comm "syz-executor242", pid 5066, jiffies 4294946079 (age 13.220s) hex dump (first 32 bytes): 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff812ae26a>] alloc_pid+0x6a/0x560 kernel/pid.c:180 [<ffffffff812718df>] copy_process+0x169f/0x26c0 kernel/fork.c:2285 [<ffffffff81272b37>] kernel_clone+0xf7/0x610 kernel/fork.c:2684 [<ffffffff812730cc>] __do_sys_clone+0x7c/0xb0 kernel/fork.c:2825 [<ffffffff849ad699>] do_syscall_x64 arch/x86/entry/common.c:50 [inline] [<ffffffff849ad699>] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 [<ffffffff84a0008b>] entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: 314001f0bf92 ("af_unix: Add OOB support") Reported-by: syzbot+7699d9e5635c10253a27@syzkaller.appspotmail.com Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Rao Shoaib <rao.shoaib@oracle.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20230307164530.771896-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | | eth: fealnx: bring back this old driverJakub Kicinski2023-03-085-0/+1966
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit d5e2d038dbece821f1af57acbeded3aa9a1832c1. We have a report of this chip being used on a SURECOM EP-320X-S 100/10M Ethernet PCI Adapter which could still have been purchased in some parts of the world 3 years ago. Cc: stable@vger.kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=217151 Fixes: d5e2d038dbec ("eth: fealnx: delete the driver for Myson MTD-800") Link: https://lore.kernel.org/r/20230307171930.4008454-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | | net: dsa: mt7530: permit port 5 to work without port 6 on MT7621 SoCVladimir Oltean2023-03-081-15/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The MT7530 switch from the MT7621 SoC has 2 ports which can be set up as internal: port 5 and 6. Arınç reports that the GMAC1 attached to port 5 receives corrupted frames, unless port 6 (attached to GMAC0) has been brought up by the driver. This is true regardless of whether port 5 is used as a user port or as a CPU port (carrying DSA tags). Offline debugging (blind for me) which began in the linked thread showed experimentally that the configuration done by the driver for port 6 contains a step which is needed by port 5 as well - the write to CORE_GSWPLL_GRP2 (note that I've no idea as to what it does, apart from the comment "Set core clock into 500Mhz"). Prints put by Arınç show that the reset value of CORE_GSWPLL_GRP2 is RG_GSWPLL_POSDIV_500M(1) | RG_GSWPLL_FBKDIV_500M(40) (0x128), both on the MCM MT7530 from the MT7621 SoC, as well as on the standalone MT7530 from MT7623NI Bananapi BPI-R2. Apparently, port 5 on the standalone MT7530 can work under both values of the register, while on the MT7621 SoC it cannot. The call path that triggers the register write is: mt753x_phylink_mac_config() for port 6 -> mt753x_pad_setup() -> mt7530_pad_clk_setup() so this fully explains the behavior noticed by Arınç, that bringing port 6 up is necessary. The simplest fix for the problem is to extract the register writes which are needed for both port 5 and 6 into a common mt7530_pll_setup() function, which is called at mt7530_setup() time, immediately after switch reset. We can argue that this mirrors the code layout introduced in mt7531_setup() by commit 42bc4fafe359 ("net: mt7531: only do PLL once after the reset"), in that the PLL setup has the exact same positioning, and further work to consolidate the separate setup() functions is not hindered. Testing confirms that: - the slight reordering of writes to MT7530_P6ECR and to CORE_GSWPLL_GRP1 / CORE_GSWPLL_GRP2 introduced by this change does not appear to cause problems for the operation of port 6 on MT7621 and on MT7623 (where port 5 also always worked) - packets sent through port 5 are not corrupted anymore, regardless of whether port 6 is enabled by phylink or not (or even present in the device tree) My algorithm for determining the Fixes: tag is as follows. Testing shows that some logic from mt7530_pad_clk_setup() is needed even for port 5. Prior to commit ca366d6c889b ("net: dsa: mt7530: Convert to PHYLINK API"), a call did exist for all phy_is_pseudo_fixed_link() ports - so port 5 included. That commit replaced it with a temporary "Port 5 is not supported!" comment, and the following commit 38f790a80560 ("net: dsa: mt7530: Add support for port 5") replaced that comment with a configuration procedure in mt7530_setup_port5() which was insufficient for port 5 to work. I'm laying the blame on the patch that claimed support for port 5, although one would have also needed the change from commit c3b8e07909db ("net: dsa: mt7530: setup core clock even in TRGMII mode") for the write to be performed completely independently from port 6's configuration. Thanks go to Arınç for describing the problem, for debugging and for testing. Reported-by: Arınç ÜNAL <arinc.unal@arinc9.com> Link: https://lore.kernel.org/netdev/f297c2c4-6e7c-57ac-2394-f6025d309b9d@arinc9.com/ Fixes: 38f790a80560 ("net: dsa: mt7530: Add support for port 5") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Arınç ÜNAL <arinc.unal@arinc9.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230307155411.868573-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | | net: microchip: sparx5: fix deletion of existing DSCP mappingsDaniel Machon2023-03-081-16/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix deletion of existing DSCP mappings in the APP table. Adding and deleting DSCP entries are replicated per-port, since the mapping table is global for all ports in the chip. Whenever a mapping for a DSCP value already exists, the old mapping is deleted first. However, it is only deleted for the specified port. Fix this by calling sparx5_dcb_ieee_delapp() instead of dcb_ieee_delapp() as it ought to be. Reproduce: // Map and remap DSCP value 63 $ dcb app add dev eth0 dscp-prio 63:1 $ dcb app add dev eth0 dscp-prio 63:2 $ dcb app show dev eth0 dscp-prio dscp-prio 63:2 $ dcb app show dev eth1 dscp-prio dscp-prio 63:1 63:2 <-- 63:1 should not be there Fixes: 8dcf69a64118 ("net: microchip: sparx5: add support for offloading dscp table") Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | octeontx2-af: Unlock contexts in the queue context cache in case of fault ↵Suman Ghosh2023-03-085-7/+82
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | detection NDC caches contexts of frequently used queue's (Rx and Tx queues) contexts. Due to a HW errata when NDC detects fault/poision while accessing contexts it could go into an illegal state where a cache line could get locked forever. To makesure all cache lines in NDC are available for optimum performance upon fault/lockerror/posion errors scan through all cache lines in NDC and clear the lock bit. Fixes: 4a3581cd5995 ("octeontx2-af: NPA AQ instruction enqueue support") Signed-off-by: Suman Ghosh <sumang@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: Sai Krishna <saikrishnag@marvell.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net/smc: fix fallback failed while sendmsg with fastopenD. Wythe2023-03-081-5/+8
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before determining whether the msg has unsupported options, it has been prematurely terminated by the wrong status check. For the application, the general usages of MSG_FASTOPEN likes fd = socket(...) /* rather than connect */ sendto(fd, data, len, MSG_FASTOPEN) Hence, We need to check the flag before state check, because the sock state here is always SMC_INIT when applications tries MSG_FASTOPEN. Once we found unsupported options, fallback it to TCP. Fixes: ee9dfbef02d1 ("net/smc: handle sockopts forcing fallback") Signed-off-by: D. Wythe <alibuda@linux.alibaba.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> v2 -> v1: Optimize code style Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | ynl: re-license uniformly under GPL-2.0 OR BSD-3-ClauseJakub Kicinski2023-03-0720-18/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I was intending to make all the Netlink Spec code BSD-3-Clause to ease the adoption but it appears that: - I fumbled the uAPI and used "GPL WITH uAPI note" there - it gives people pause as they expect GPL in the kernel As suggested by Chuck re-license under dual. This gives us benefit of full BSD freedom while fulfilling the broad "kernel is under GPL" expectations. Link: https://lore.kernel.org/all/20230304120108.05dd44c5@kernel.org/ Link: https://lore.kernel.org/r/20230306200457.3903854-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | mailmap: update entries for Stephen HemmingerStephen Hemminger2023-03-071-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | Map all my old email addresses to current address. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Link: https://lore.kernel.org/r/20230306194405.108236-1-stephen@networkplumber.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | mailmap: add entry for Maxim MikityanskiyJakub Kicinski2023-03-071-0/+2
| | | | | | | | | | | | | | | | | | | | | Map Maxim's old corporate addresses to his personal one. Link: https://lore.kernel.org/r/20230306192018.3894988-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | nfc: change order inside nfc_se_io error pathFedor Pchelkin2023-03-071-1/+1
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | cb_context should be freed on the error path in nfc_se_io as stated by commit 25ff6f8a5a3b ("nfc: fix memory leak of se_io context in nfc_genl_se_io"). Make the error path in nfc_se_io unwind everything in reverse order, i.e. free the cb_context after unlocking the device. Suggested-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20230306212650.230322-1-pchelkin@ispras.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * net: usb: qmi_wwan: add Telit 0x1080 compositionEnrico Sau2023-03-071-0/+1
| | | | | | | | | | | | | | | | | | | | Add the following Telit FE990 composition: 0x1080: tty, adb, rmnet, tty, tty, tty, tty Signed-off-by: Enrico Sau <enrico.sau@gmail.com> Link: https://lore.kernel.org/r/20230306120528.198842-1-enrico.sau@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * net: usb: cdc_mbim: avoid altsetting toggling for Telit FE990Enrico Sau2023-03-071-0/+5
| | | | | | | | | | | | | | | | | | Add quirk CDC_MBIM_FLAG_AVOID_ALTSETTING_TOGGLE for Telit FE990 0x1081 composition in order to avoid bind error. Signed-off-by: Enrico Sau <enrico.sau@gmail.com> Link: https://lore.kernel.org/r/20230306115933.198259-1-enrico.sau@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * Merge branch 'main' of ↵Paolo Abeni2023-03-075-11/+18
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Restore ctnetlink zero mark in events and dump, from Ivan Delalande. 2) Fix deadlock due to missing disabled bh in tproxy, from Florian Westphal. 3) Safer maximum chain load in conntrack, from Eric Dumazet. * 'main' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: conntrack: adopt safer max chain length netfilter: tproxy: fix deadlock due to missing BH disable netfilter: ctnetlink: revert to dumping mark regardless of event type ==================== Link: https://lore.kernel.org/r/20230307100424.2037-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| | * netfilter: conntrack: adopt safer max chain lengthEric Dumazet2023-03-071-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Customers using GKE 1.25 and 1.26 are facing conntrack issues root caused to commit c9c3b6811f74 ("netfilter: conntrack: make max chain length random"). Even if we assume Uniform Hashing, a bucket often reachs 8 chained items while the load factor of the hash table is smaller than 0.5 With a limit of 16, we reach load factors of 3. With a limit of 32, we reach load factors of 11. With a limit of 40, we reach load factors of 15. With a limit of 50, we reach load factors of 24. This patch changes MIN_CHAINLEN to 50, to minimize risks. Ideally, we could in the future add a cushion based on expected load factor (2 * nf_conntrack_max / nf_conntrack_buckets), because some setups might expect unusual values. Fixes: c9c3b6811f74 ("netfilter: conntrack: make max chain length random") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * netfilter: tproxy: fix deadlock due to missing BH disableFlorian Westphal2023-03-063-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The xtables packet traverser performs an unconditional local_bh_disable(), but the nf_tables evaluation loop does not. Functions that are called from either xtables or nftables must assume that they can be called in process context. inet_twsk_deschedule_put() assumes that no softirq interrupt can occur. If tproxy is used from nf_tables its possible that we'll deadlock trying to aquire a lock already held in process context. Add a small helper that takes care of this and use it. Link: https://lore.kernel.org/netfilter-devel/401bd6ed-314a-a196-1cdc-e13c720cc8f2@balasys.hu/ Fixes: 4ed8eb6570a4 ("netfilter: nf_tables: Add native tproxy support") Reported-and-tested-by: Major Dávid <major.david@balasys.hu> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * netfilter: ctnetlink: revert to dumping mark regardless of event typeIvan Delalande2023-03-061-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It seems that change was unintentional, we have userspace code that needs the mark while listening for events like REPLY, DESTROY, etc. Also include 0-marks in requested dumps, as they were before that fix. Fixes: 1feeae071507 ("netfilter: ctnetlink: fix compilation warning after data race fixes in ct mark") Signed-off-by: Ivan Delalande <colona@arista.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | Merge tag 'for-netdev' of ↵Jakub Kicinski2023-03-069-18/+64
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2023-03-06 We've added 8 non-merge commits during the last 7 day(s) which contain a total of 9 files changed, 64 insertions(+), 18 deletions(-). The main changes are: 1) Fix BTF resolver for DATASEC sections when a VAR points at a modifier, that is, keep resolving such instances instead of bailing out, from Lorenz Bauer. 2) Fix BPF test framework with regards to xdp_frame info misplacement in the "live packet" code, from Alexander Lobakin. 3) Fix an infinite loop in BPF sockmap code for TCP/UDP/AF_UNIX, from Liu Jian. 4) Fix a build error for riscv BPF JIT under PERF_EVENTS=n, from Randy Dunlap. 5) Several BPF doc fixes with either broken links or external instead of internal doc links, from Bagas Sanjaya. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests/bpf: check that modifier resolves after pointer btf: fix resolving BTF_KIND_VAR after ARRAY, STRUCT, UNION, PTR bpf, test_run: fix &xdp_frame misplacement for LIVE_FRAMES bpf, doc: Link to submitting-patches.rst for general patch submission info bpf, doc: Do not link to docs.kernel.org for kselftest link bpf, sockmap: Fix an infinite loop error when len is 0 in tcp_bpf_recvmsg_parser() riscv, bpf: Fix patch_text implicit declaration bpf, docs: Fix link to BTF doc ==================== Link: https://lore.kernel.org/r/20230306215944.11981-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| | * \ Merge branch 'fix resolving VAR after DATASEC'Martin KaFai Lau2023-03-062-0/+29
| | |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Lorenz Bauer says: ==================== See the first patch for a detailed explanation. v2: - Move RESOLVE_TBD assignment out of the loop (Martin) ==================== Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
| | | * | selftests/bpf: check that modifier resolves after pointerLorenz Bauer2023-03-061-0/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a regression test that ensures that a VAR pointing at a modifier which follows a PTR (or STRUCT or ARRAY) is resolved correctly by the datasec validator. Signed-off-by: Lorenz Bauer <lmb@isovalent.com> Link: https://lore.kernel.org/r/20230306112138.155352-3-lmb@isovalent.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
| | | * | btf: fix resolving BTF_KIND_VAR after ARRAY, STRUCT, UNION, PTRLorenz Bauer2023-03-061-0/+1
| | |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | btf_datasec_resolve contains a bug that causes the following BTF to fail loading: [1] DATASEC a size=2 vlen=2 type_id=4 offset=0 size=1 type_id=7 offset=1 size=1 [2] INT (anon) size=1 bits_offset=0 nr_bits=8 encoding=(none) [3] PTR (anon) type_id=2 [4] VAR a type_id=3 linkage=0 [5] INT (anon) size=1 bits_offset=0 nr_bits=8 encoding=(none) [6] TYPEDEF td type_id=5 [7] VAR b type_id=6 linkage=0 This error message is printed during btf_check_all_types: [1] DATASEC a size=2 vlen=2 type_id=7 offset=1 size=1 Invalid type By tracing btf_*_resolve we can pinpoint the problem: btf_datasec_resolve(depth: 1, type_id: 1, mode: RESOLVE_TBD) = 0 btf_var_resolve(depth: 2, type_id: 4, mode: RESOLVE_TBD) = 0 btf_ptr_resolve(depth: 3, type_id: 3, mode: RESOLVE_PTR) = 0 btf_var_resolve(depth: 2, type_id: 4, mode: RESOLVE_PTR) = 0 btf_datasec_resolve(depth: 1, type_id: 1, mode: RESOLVE_PTR) = -22 The last invocation of btf_datasec_resolve should invoke btf_var_resolve by means of env_stack_push, instead it returns EINVAL. The reason is that env_stack_push is never executed for the second VAR. if (!env_type_is_resolve_sink(env, var_type) && !env_type_is_resolved(env, var_type_id)) { env_stack_set_next_member(env, i + 1); return env_stack_push(env, var_type, var_type_id); } env_type_is_resolve_sink() changes its behaviour based on resolve_mode. For RESOLVE_PTR, we can simplify the if condition to the following: (btf_type_is_modifier() || btf_type_is_ptr) && !env_type_is_resolved() Since we're dealing with a VAR the clause evaluates to false. This is not sufficient to trigger the bug however. The log output and EINVAL are only generated if btf_type_id_size() fails. if (!btf_type_id_size(btf, &type_id, &type_size)) { btf_verifier_log_vsi(env, v->t, vsi, "Invalid type"); return -EINVAL; } Most types are sized, so for example a VAR referring to an INT is not a problem. The bug is only triggered if a VAR points at a modifier. Since we skipped btf_var_resolve that modifier was also never resolved, which means that btf_resolved_type_id returns 0 aka VOID for the modifier. This in turn causes btf_type_id_size to return NULL, triggering EINVAL. To summarise, the following conditions are necessary: - VAR pointing at PTR, STRUCT, UNION or ARRAY - Followed by a VAR pointing at TYPEDEF, VOLATILE, CONST, RESTRICT or TYPE_TAG The fix is to reset resolve_mode to RESOLVE_TBD before attempting to resolve a VAR from a DATASEC. Fixes: 1dc92851849c ("bpf: kernel side support for BTF Var and DataSec") Signed-off-by: Lorenz Bauer <lmb@isovalent.com> Link: https://lore.kernel.org/r/20230306112138.155352-2-lmb@isovalent.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
| | * | bpf, test_run: fix &xdp_frame misplacement for LIVE_FRAMESAlexander Lobakin2023-03-062-9/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | &xdp_buff and &xdp_frame are bound in a way that xdp_buff->data_hard_start == xdp_frame It's always the case and e.g. xdp_convert_buff_to_frame() relies on this. IOW, the following: for (u32 i = 0; i < 0xdead; i++) { xdpf = xdp_convert_buff_to_frame(&xdp); xdp_convert_frame_to_buff(xdpf, &xdp); } shouldn't ever modify @xdpf's contents or the pointer itself. However, "live packet" code wrongly treats &xdp_frame as part of its context placed *before* the data_hard_start. With such flow, data_hard_start is sizeof(*xdpf) off to the right and no longer points to the XDP frame. Instead of replacing `sizeof(ctx)` with `offsetof(ctx, xdpf)` in several places and praying that there are no more miscalcs left somewhere in the code, unionize ::frm with ::data in a flex array, so that both starts pointing to the actual data_hard_start and the XDP frame actually starts being a part of it, i.e. a part of the headroom, not the context. A nice side effect is that the maximum frame size for this mode gets increased by 40 bytes, as xdp_buff::frame_sz includes everything from data_hard_start (-> includes xdpf already) to the end of XDP/skb shared info. Also update %MAX_PKT_SIZE accordingly in the selftests code. Leave it hardcoded for 64 bit && 4k pages, it can be made more flexible later on. Minor: align `&head->data` with how `head->frm` is assigned for consistency. Minor #2: rename 'frm' to 'frame' in &xdp_page_head while at it for clarity. (was found while testing XDP traffic generator on ice, which calls xdp_convert_frame_to_buff() for each XDP frame) Fixes: b530e9e1063e ("bpf: Add "live packet" mode for XDP in BPF_PROG_RUN") Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://lore.kernel.org/r/20230224163607.2994755-1-aleksander.lobakin@intel.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
| | * | bpf, doc: Link to submitting-patches.rst for general patch submission infoBagas Sanjaya2023-03-061-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The link for patch submission information in general refers to index page for "Working with the kernel development community" section of kernel docs, whereas the link should have been Documentation/process/submitting-patches.rst instead. Fix it by replacing the index target with the appropriate doc. Fixes: 542228384888f5 ("bpf, doc: convert bpf_devel_QA.rst to use RST formatting") Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230228074523.11493-3-bagasdotme@gmail.com
| | * | bpf, doc: Do not link to docs.kernel.org for kselftest linkBagas Sanjaya2023-03-061-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The question on how to run BPF selftests have a reference link to kernel selftest documentation (Documentation/dev-tools/kselftest.rst). However, it uses external link to the documentation at kernel.org/docs (aka docs.kernel.org) instead, which requires Internet access. Fix this and replace the link with internal linking, by using :doc: directive while keeping the anchor text. Fixes: b7a27c3aafa252 ("bpf, doc: howto use/run the BPF selftests") Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230228074523.11493-2-bagasdotme@gmail.com
| | * | bpf, sockmap: Fix an infinite loop error when len is 0 in ↵Liu Jian2023-03-033-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tcp_bpf_recvmsg_parser() When the buffer length of the recvmsg system call is 0, we got the flollowing soft lockup problem: watchdog: BUG: soft lockup - CPU#3 stuck for 27s! [a.out:6149] CPU: 3 PID: 6149 Comm: a.out Kdump: loaded Not tainted 6.2.0+ #30 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014 RIP: 0010:remove_wait_queue+0xb/0xc0 Code: 5e 41 5f c3 cc cc cc cc 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 57 <41> 56 41 55 41 54 55 48 89 fd 53 48 89 f3 4c 8d 6b 18 4c 8d 73 20 RSP: 0018:ffff88811b5978b8 EFLAGS: 00000246 RAX: 0000000000000000 RBX: ffff88811a7d3780 RCX: ffffffffb7a4d768 RDX: dffffc0000000000 RSI: ffff88811b597908 RDI: ffff888115408040 RBP: 1ffff110236b2f1b R08: 0000000000000000 R09: ffff88811a7d37e7 R10: ffffed10234fa6fc R11: 0000000000000001 R12: ffff88811179b800 R13: 0000000000000001 R14: ffff88811a7d38a8 R15: ffff88811a7d37e0 FS: 00007f6fb5398740(0000) GS:ffff888237180000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000000 CR3: 000000010b6ba002 CR4: 0000000000370ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> tcp_msg_wait_data+0x279/0x2f0 tcp_bpf_recvmsg_parser+0x3c6/0x490 inet_recvmsg+0x280/0x290 sock_recvmsg+0xfc/0x120 ____sys_recvmsg+0x160/0x3d0 ___sys_recvmsg+0xf0/0x180 __sys_recvmsg+0xea/0x1a0 do_syscall_64+0x3f/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc The logic in tcp_bpf_recvmsg_parser is as follows: msg_bytes_ready: copied = sk_msg_recvmsg(sk, psock, msg, len, flags); if (!copied) { wait data; goto msg_bytes_ready; } In this case, "copied" always is 0, the infinite loop occurs. According to the Linux system call man page, 0 should be returned in this case. Therefore, in tcp_bpf_recvmsg_parser(), if the length is 0, directly return. Also modify several other functions with the same problem. Fixes: 1f5be6b3b063 ("udp: Implement udp_bpf_recvmsg() for sockmap") Fixes: 9825d866ce0d ("af_unix: Implement unix_dgram_bpf_recvmsg()") Fixes: c5d2177a72a1 ("bpf, sockmap: Fix race in ingress receive verdict with redirect to self") Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Liu Jian <liujian56@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Cc: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20230303080946.1146638-1-liujian56@huawei.com
| | * | riscv, bpf: Fix patch_text implicit declarationRandy Dunlap2023-02-271-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | bpf_jit_comp64.c uses patch_text(), so add <asm/patch.h> to it to prevent the build error: ../arch/riscv/net/bpf_jit_comp64.c: In function 'bpf_arch_text_poke': ../arch/riscv/net/bpf_jit_comp64.c:691:23: error: implicit declaration of function 'patch_text'; did you mean 'path_get'? [-Werror=implicit-function-declaration] 691 | ret = patch_text(ip, new_insns, ninsns); | ^~~~~~~~~~ Fixes: 596f2e6f9cf4 ("riscv, bpf: Add bpf_arch_text_poke support for RV64") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Pu Lehui <pulehui@huawei.com> Acked-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/202302271000.Aj4nMXbZ-lkp@intel.com Link: https://lore.kernel.org/bpf/20230227072016.14618-1-rdunlap@infradead.org
| | * | bpf, docs: Fix link to BTF docBagas Sanjaya2023-02-271-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ross reported broken link to BTF documentation (Documentation/bpf/btf.rst) in Documentation/bpf/bpf_devel_QA.rst. The link in question is written using external link syntax, with the target refers to BTF doc in reST source (btf.rst), which doesn't exist in resulting HTML output. Fix the link by replacing external link syntax with simply writing out the target doc, which the link will be generated to the correct HTML doc target. Fixes: 6736aa793c2b5f ("selftests/bpf: Add general instructions for test execution") Reported-by: Ross Zwisler <zwisler@google.com> Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Ross Zwisler <zwisler@google.com> Link: https://lore.kernel.org/linux-doc/Y++09LKx25dtR4Ow@google.com/ Link: https://lore.kernel.org/bpf/20230222083530.26136-1-bagasdotme@gmail.com
| * | | net: tls: fix device-offloaded sendpage straddling recordsJakub Kicinski2023-03-061-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adrien reports that incorrect data is transmitted when a single page straddles multiple records. We would transmit the same data in all iterations of the loop. Reported-by: Adrien Moulin <amoulin@corp.free.fr> Link: https://lore.kernel.org/all/61481278.42813558.1677845235112.JavaMail.zimbra@corp.free.fr Fixes: c1318b39c7d3 ("tls: Add opt-in zerocopy mode of sendfile()") Tested-by: Adrien Moulin <amoulin@corp.free.fr> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Acked-by: Maxim Mikityanskiy <maxtram95@gmail.com> Link: https://lore.kernel.org/r/20230304192610.3818098-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>