summaryrefslogtreecommitdiffstats
path: root/drivers/net/virtio_net.c
Commit message (Collapse)AuthorAgeFilesLines
* virtio_net: Introduce skb_vnet_common_hdr to avoid typecastingFeng Liu2023-08-231-9/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The virtio_net driver currently deals with different versions and types of virtio net headers, such as virtio_net_hdr_mrg_rxbuf, virtio_net_hdr_v1_hash, etc. Due to these variations, the code relies on multiple type casts to convert memory between different structures, potentially leading to bugs when there are changes in these structures. Introduces the "struct skb_vnet_common_hdr" as a unifying header structure using a union. With this approach, various virtio net header structures can be converted by accessing different members of this structure, thus eliminating the need for type casting and reducing the risk of potential bugs. For example following code: static struct sk_buff *page_to_skb(struct virtnet_info *vi, struct receive_queue *rq, struct page *page, unsigned int offset, unsigned int len, unsigned int truesize, unsigned int headroom) { [...] struct virtio_net_hdr_mrg_rxbuf *hdr; [...] hdr_len = vi->hdr_len; [...] ok: hdr = skb_vnet_hdr(skb); memcpy(hdr, hdr_p, hdr_len); [...] } When VIRTIO_NET_F_HASH_REPORT feature is enabled, hdr_len = 20 But the sizeof(*hdr) is 12, memcpy(hdr, hdr_p, hdr_len); will copy 20 bytes to the hdr, which make a potential risk of bug. And this risk can be avoided by introducing struct skb_vnet_common_hdr. Change log v1->v2 feedback from Willem de Bruijn <willemdebruijn.kernel@gmail.com> feedback from Simon Horman <horms@kernel.org> 1. change to use net-next tree. 2. move skb_vnet_common_hdr inside kernel file instead of the UAPI header. v2->v3 feedback from Willem de Bruijn <willemdebruijn.kernel@gmail.com> 1. fix typo in commit message. 2. add original struct virtio_net_hdr into union 3. remove virtio_net_hdr_mrg_rxbuf variable in receive_buf; Signed-off-by: Feng Liu <feliu@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2023-08-181-3/+3
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/sfc/tc.c fa165e194997 ("sfc: don't unregister flow_indr if it was never registered") 3bf969e88ada ("sfc: add MAE table machinery for conntrack table") https://lore.kernel.org/all/20230818112159.7430e9b4@canb.auug.org.au/ No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * Merge tag 'net-6.5-rc7' of ↵Linus Torvalds2023-08-181-2/+2
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from ipsec and netfilter. No known outstanding regressions. Fixes to fixes: - virtio-net: set queues after driver_ok, avoid a potential race added by recent fix - Revert "vlan: Fix VLAN 0 memory leak", it may lead to a warning when VLAN 0 is registered explicitly - nf_tables: - fix false-positive lockdep splat in recent fixes - don't fail inserts if duplicate has expired (fix test failures) - fix races between garbage collection and netns dismantle Current release - new code bugs: - mlx5: Fix mlx5_cmd_update_root_ft() error flow Previous releases - regressions: - phy: fix IRQ-based wake-on-lan over hibernate / power off Previous releases - always broken: - sock: fix misuse of sk_under_memory_pressure() preventing system from exiting global TCP memory pressure if a single cgroup is under pressure - fix the RTO timer retransmitting skb every 1ms if linear option is enabled - af_key: fix sadb_x_filter validation, amment netlink policy - ipsec: fix slab-use-after-free in decode_session6() - macb: in ZynqMP resume always configure PS GTR for non-wakeup source Misc: - netfilter: set default timeout to 3 secs for sctp shutdown send and recv state (from 300ms), align with protocol timers" * tag 'net-6.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (49 commits) ice: Block switchdev mode when ADQ is active and vice versa qede: fix firmware halt over suspend and resume net: do not allow gso_size to be set to GSO_BY_FRAGS sock: Fix misuse of sk_under_memory_pressure() sfc: don't fail probe if MAE/TC setup fails sfc: don't unregister flow_indr if it was never registered net: dsa: mv88e6xxx: Wait for EEPROM done before HW reset net/mlx5: Fix mlx5_cmd_update_root_ft() error flow net/mlx5e: XDP, Fix fifo overrun on XDP_REDIRECT i40e: fix misleading debug logs iavf: fix FDIR rule fields masks validation ipv6: fix indentation of a config attribute mailmap: add entries for Simon Horman broadcom: b44: Use b44_writephy() return value net: openvswitch: reject negative ifindex team: Fix incorrect deletion of ETH_P_8021AD protocol vid from slaves net: phy: broadcom: stub c45 read/write for 54810 netfilter: nft_dynset: disallow object maps netfilter: nf_tables: GC transaction race with netns dismantle netfilter: nf_tables: fix GC transaction races with netns and netlink event exit path ...
| | * virtio-net: set queues after driver_okJason Wang2023-08-111-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 25266128fe16 ("virtio-net: fix race between set queues and probe") tries to fix the race between set queues and probe by calling _virtnet_set_queues() before DRIVER_OK is set. This violates virtio spec. Fixing this by setting queues after virtio_device_ready(). Note that rtnl needs to be held for userspace requests to change the number of queues. So we are serialized in this way. Fixes: 25266128fe16 ("virtio-net: fix race between set queues and probe") Reported-by: Dragos Tatulea <dtatulea@nvidia.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | virtio-net: Zero max_tx_vq field for VIRTIO_NET_CTRL_MQ_HASH_CONFIG caseHawkins Jiawei2023-08-101-1/+1
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Kernel uses `struct virtio_net_ctrl_rss` to save command-specific-data for both the VIRTIO_NET_CTRL_MQ_HASH_CONFIG and VIRTIO_NET_CTRL_MQ_RSS_CONFIG commands. According to the VirtIO standard, "Field reserved MUST contain zeroes. It is defined to make the structure to match the layout of virtio_net_rss_config structure, defined in 5.1.6.5.7.". Yet for the VIRTIO_NET_CTRL_MQ_HASH_CONFIG command case, the `max_tx_vq` field in struct virtio_net_ctrl_rss, which corresponds to the `reserved` field in struct virtio_net_hash_config, is not zeroed, thereby violating the VirtIO standard. This patch solves this problem by zeroing this field in virtnet_init_default_rss(). Cc: Andrew Melnychenko <andrew@daynix.com> Cc: stable@vger.kernel.org Fixes: c7114b1249fa ("drivers/net/virtio_net: Added basic RSS support.") Signed-off-by: Hawkins Jiawei <yin31149@gmail.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20230810110405.25558-1-yin31149@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com>
* | Merge tag 'for-netdev' of ↵Jakub Kicinski2023-08-031-0/+1
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Martin KaFai Lau says: ==================== pull-request: bpf-next 2023-08-03 We've added 54 non-merge commits during the last 10 day(s) which contain a total of 84 files changed, 4026 insertions(+), 562 deletions(-). The main changes are: 1) Add SO_REUSEPORT support for TC bpf_sk_assign from Lorenz Bauer, Daniel Borkmann 2) Support new insns from cpu v4 from Yonghong Song 3) Non-atomically allocate freelist during prefill from YiFei Zhu 4) Support defragmenting IPv(4|6) packets in BPF from Daniel Xu 5) Add tracepoint to xdp attaching failure from Leon Hwang 6) struct netdev_rx_queue and xdp.h reshuffling to reduce rebuild time from Jakub Kicinski * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (54 commits) net: invert the netdevice.h vs xdp.h dependency net: move struct netdev_rx_queue out of netdevice.h eth: add missing xdp.h includes in drivers selftests/bpf: Add testcase for xdp attaching failure tracepoint bpf, xdp: Add tracepoint to xdp attaching failure selftests/bpf: fix static assert compilation issue for test_cls_*.c bpf: fix bpf_probe_read_kernel prototype mismatch riscv, bpf: Adapt bpf trampoline to optimized riscv ftrace framework libbpf: fix typos in Makefile tracing: bpf: use struct trace_entry in struct syscall_tp_t bpf, devmap: Remove unused dtab field from bpf_dtab_netdev bpf, cpumap: Remove unused cmap field from bpf_cpu_map_entry netfilter: bpf: Only define get_proto_defrag_hook() if necessary bpf: Fix an array-index-out-of-bounds issue in disasm.c net: remove duplicate INDIRECT_CALLABLE_DECLARE of udp[6]_ehashfn docs/bpf: Fix malformed documentation bpf: selftests: Add defrag selftests bpf: selftests: Support custom type and proto for client sockets bpf: selftests: Support not connecting client socket netfilter: bpf: Support BPF_F_NETFILTER_IP_DEFRAG in netfilter link ... ==================== Link: https://lore.kernel.org/r/20230803174845.825419-1-martin.lau@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | net: move struct netdev_rx_queue out of netdevice.hJakub Kicinski2023-08-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | struct netdev_rx_queue is touched in only a few places and having it defined in netdevice.h brings in the dependency on xdp.h, because struct xdp_rxq_info gets embedded in struct netdev_rx_queue. In prep for removal of xdp.h from netdevice.h move all the netdev_rx_queue stuff to a new header. We could technically break the new header up to avoid the sysfs.h include but it's so rarely included it doesn't seem to be worth it at this point. Reviewed-by: Amritha Nambiar <amritha.nambiar@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Link: https://lore.kernel.org/r/20230803010230.1755386-3-kuba@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
* | | virtio_net: enable per queue interrupt coalesce featureGavin Li2023-08-011-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Enable per queue interrupt coalesce feature bit in driver and validate its dependency with control queue. Signed-off-by: Gavin Li <gavinl@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Heng Qi <hengqi@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20230731070656.96411-4-gavinl@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | | virtio_net: support per queue interrupt coalesce commandGavin Li2023-08-011-8/+141
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add interrupt_coalesce config in send_queue and receive_queue to cache user config. Send per virtqueue interrupt moderation config to underlying device in order to have more efficient interrupt moderation and cpu utilization of guest VM. Additionally, address all the VQs when updating the global configuration, as now the individual VQs configuration can diverge from the global configuration. Signed-off-by: Gavin Li <gavinl@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Heng Qi <hengqi@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20230731070656.96411-3-gavinl@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | | virtio_net: extract interrupt coalescing settings to a structureGavin Li2023-08-011-16/+19
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | Extract interrupt coalescing settings to a structure so that it could be reused in other data structures. Signed-off-by: Gavin Li <gavinl@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Heng Qi <hengqi@linux.alibaba.com> Link: https://lore.kernel.org/r/20230731070656.96411-2-gavinl@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | virtio-net: fix race between set queues and probeJason Wang2023-07-261-2/+2
|/ | | | | | | | | | | | | | | | A race were found where set_channels could be called after registering but before virtnet_set_queues() in virtnet_probe(). Fixing this by moving the virtnet_set_queues() before netdevice registering. While at it, use _virtnet_set_queues() to avoid holding rtnl as the device is not even registered at that time. Cc: stable@vger.kernel.org Fixes: a220871be66f ("virtio-net: correctly enable multiqueue") Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Link: https://lore.kernel.org/r/20230725072049.617289-1-jasowang@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2023-06-081-8/+8
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Cross-merge networking fixes after downstream PR. Conflicts: net/sched/sch_taprio.c d636fc5dd692 ("net: sched: add rcu annotations around qdisc->qdisc_sleeping") dced11ef84fb ("net/sched: taprio: don't overwrite "sch" variable in taprio_dump_class_stats()") net/ipv4/sysctl_net_ipv4.c e209fee4118f ("net/ipv4: ping_group_range: allow GID from 2147483648 to 4294967294") ccce324dabfe ("tcp: make the first N SYN RTO backoffs linear") https://lore.kernel.org/all/20230605100816.08d41a7b@canb.auug.org.au/ No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * virtio_net: use control_buf for coalesce paramsBrett Creeley2023-06-061-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 699b045a8e43 ("net: virtio_net: notifications coalescing support") added coalescing command support for virtio_net. However, the coalesce commands are using buffers on the stack, which is causing the device to see DMA errors. There should also be a complaint from check_for_stack() in debug_dma_map_xyz(). Fix this by adding and using coalesce params from the control_buf struct, which aligns with other commands. Cc: stable@vger.kernel.org Fixes: 699b045a8e43 ("net: virtio_net: notifications coalescing support") Reviewed-by: Shannon Nelson <shannon.nelson@amd.com> Signed-off-by: Allen Hubbe <allen.hubbe@amd.com> Signed-off-by: Brett Creeley <brett.creeley@amd.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20230605195925.51625-1-brett.creeley@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2023-05-181-17/+44
|\| | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/ethernet/freescale/fec_main.c 6ead9c98cafc ("net: fec: remove the xdp_return_frame when lack of tx BDs") 144470c88c5d ("net: fec: using the standard return codes when xdp xmit errors") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * virtio_net: Fix error unwinding of XDP initializationFeng Liu2023-05-151-17/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When initializing XDP in virtnet_open(), some rq xdp initialization may hit an error causing net device open failed. However, previous rqs have already initialized XDP and enabled NAPI, which is not the expected behavior. Need to roll back the previous rq initialization to avoid leaks in error unwinding of init code. Also extract helper functions of disable and enable queue pairs. Use newly introduced disable helper function in error unwinding and virtnet_close. Use enable helper function in virtnet_open. Fixes: 754b8a21a96d ("virtio_net: setup xdp_rxq_info") Signed-off-by: Feng Liu <feliu@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: William Tu <witu@nvidia.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: introduce and use skb_frag_fill_page_desc()Yunsheng Lin2023-05-131-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Most users use __skb_frag_set_page()/skb_frag_off_set()/ skb_frag_size_set() to fill the page desc for a skb frag. Introduce skb_frag_fill_page_desc() to do that. net/bpf/test_run.c does not call skb_frag_off_set() to set the offset, "copy_from_user(page_address(page), ...)" and 'shinfo' being part of the 'data' kzalloced in bpf_test_init() suggest that it is assuming offset to be initialized as zero, so call skb_frag_fill_page_desc() with offset being zero for this case. Also, skb_frag_set_page() is not used anymore, so remove it. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | virtio_net: introduce virtnet_build_skb()Xuan Zhuo2023-05-091-13/+21
| | | | | | | | | | | | | | | | | | This logic is used in multiple places, now we separate it into a helper. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | virtio_net: introduce receive_small_build_xdpXuan Zhuo2023-05-091-17/+31
| | | | | | | | | | | | | | | | | | Simplifying receive_small() function. Bringing the logic relating to build_skb together. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | virtio_net: small: remove skip_xdpXuan Zhuo2023-05-091-14/+12
| | | | | | | | | | | | | | | | | | | | Because the skb build code is not shared between xdp and non-xdp, and the xdp code in receive_small() is simpler, so "skip_xdp" is not needed. We can remove it. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | virtio_net: small: avoid code duplication in xdp scenariosXuan Zhuo2023-05-091-4/+8
| | | | | | | | | | | | | | | | | | Avoid the problem that some variables(headroom and so on) will repeat the calculation when process xdp. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | virtio_net: small: remove the deltaXuan Zhuo2023-05-091-5/+1
| | | | | | | | | | | | | | | | | | | | In the case of XDP-PASS, skb_reserve uses the "delta" to compatible non-XDP, now that is not shared between xdp and non-xdp, so we can remove this logic. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | virtio_net: introduce receive_small_xdp()Xuan Zhuo2023-05-091-65/+100
| | | | | | | | | | | | | | | | | | The purpose of this patch is to simplify the receive_small(). Separate all the logic of XDP of small into a function. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | virtio_net: merge: remove skip_xdpXuan Zhuo2023-05-091-13/+10
| | | | | | | | | | | | | | | | | | Now, the logic of merge xdp process is simple, we can remove the skip_xdp. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | virtio_net: introduce receive_mergeable_xdp()Xuan Zhuo2023-05-091-41/+64
| | | | | | | | | | | | | | | | | | The purpose of this patch is to simplify the receive_mergeable(). Separate all the logic of XDP into a function. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | virtio_net: virtnet_build_xdp_buff_mrg() auto release xdp shinfoXuan Zhuo2023-05-091-3/+7
| | | | | | | | | | | | | | | | | | virtnet_build_xdp_buff_mrg() auto release xdp shinfo then the caller no need to careful the xdp shinfo. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | virtio_net: separate the logic of freeing the rest mergeable bufXuan Zhuo2023-05-091-12/+24
| | | | | | | | | | | | | | | | | | This patch introduce a new function that frees the rest mergeable buf. The subsequent patch will reuse this function. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | virtio_net: separate the logic of freeing xdp shinfoXuan Zhuo2023-05-091-11/+16
| | | | | | | | | | | | | | | | | | This patch introduce a new function that releases the xdp shinfo. The subsequent patch will reuse this function. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | virtio_net: introduce virtnet_xdp_handler() to seprate the logic of run xdpXuan Zhuo2023-05-091-60/+58
| | | | | | | | | | | | | | | | | | | | | | At present, we have two similar logic to perform the XDP prog. Therefore, this patch separates the code of executing XDP, which is conducive to later maintenance. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | virtio_net: optimize mergeable_xdp_get_buf()Xuan Zhuo2023-05-091-15/+14
| | | | | | | | | | | | | | | | | | | | | | | | The previous patch, in order to facilitate review, I do not do any modification. This patch has made some optimization on the top. * remove some repeated logics in this function. * add fast check for passing without any alloc. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | virtio_net: introduce mergeable_xdp_get_buf()Xuan Zhuo2023-05-091-56/+79
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Separating the logic of preparation for xdp from receive_mergeable. The purpose of this is to simplify the logic of execution of XDP. The main logic here is that when headroom is insufficient, we need to allocate a new page and calculate offset. It should be noted that if there is new page, the variable page will refer to the new page. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | virtio_net: mergeable xdp: put old page immediatelyXuan Zhuo2023-05-091-12/+7
|/ | | | | | | | | | | | | | | | | | | | | | In the xdp implementation of virtio-net mergeable, it always checks whether two page is used and a page is selected to release. This is complicated for the processing of action, and be careful. In the entire process, we have such principles: * If xdp_page is used (PASS, TX, Redirect), then we release the old page. * If it is a drop case, we will release two. The old page obtained from buf is release inside err_xdp, and xdp_page needs be relased by us. But in fact, when we allocate a new page, we can release the old page immediately. Then just one is using, we just need to release the new page for drop case. On the drop path, err_xdp will release the variable "page", so we only need to let "page" point to the new xdp_page in advance. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* virtio_net: suppress cpu stall when free_unused_bufsWenliang Wang2023-05-051-0/+2
| | | | | | | | | | | For multi-queue and large ring-size use case, the following error occurred when free_unused_bufs: rcu: INFO: rcu_sched self-detected stall on CPU. Fixes: 986a4f4d452d ("virtio_net: multiqueue support") Signed-off-by: Wenliang Wang <wangwenliang.1995@bytedance.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2023-04-201-2/+6
|\ | | | | | | | | | | | | | | | | | | | | Adjacent changes: net/mptcp/protocol.h 63740448a32e ("mptcp: fix accept vs worker race") 2a6a870e44dd ("mptcp: stops worker on unaccepted sockets at listener close") ddb1a072f858 ("mptcp: move first subflow allocation at mpc access time") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * virtio_net: bugfix overflow inside xdp_linearize_page()Xuan Zhuo2023-04-171-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here we copy the data from the original buf to the new page. But we not check that it may be overflow. As long as the size received(including vnethdr) is greater than 3840 (PAGE_SIZE -VIRTIO_XDP_HEADROOM). Then the memcpy will overflow. And this is completely possible, as long as the MTU is large, such as 4096. In our test environment, this will cause crash. Since crash is caused by the written memory, it is meaningless, so I do not include it. Fixes: 72979a6c3590 ("virtio_net: xdp, add slowpath case for non contiguous buffers") Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2023-03-171-76/+95
|\| | | | | | | | | | | | | | | | | | | | | | | | | net/wireless/nl80211.c b27f07c50a73 ("wifi: nl80211: fix puncturing bitmap policy") cbbaf2bb829b ("wifi: nl80211: add a command to enable/disable HW timestamping") https://lore.kernel.org/all/20230314105421.3608efae@canb.auug.org.au tools/testing/selftests/net/Makefile 62199e3f1658 ("selftests: net: Add VXLAN MDB test") 13715acf8ab5 ("selftest: Add test for bind() conflicts.") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * virtio_net: free xdp shinfo frags when build_skb_from_xdp_buff() failsXuan Zhuo2023-03-161-1/+4
| | | | | | | | | | | | | | | | | | | | | | build_skb_from_xdp_buff() may return NULL, in this case we need to free the frags of xdp shinfo. Fixes: fab89bafa95b ("virtio-net: support multi-buffer xdp") Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * virtio_net: fix page_to_skb() miss headroomXuan Zhuo2023-03-161-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Because headroom is not passed to page_to_skb(), this causes the shinfo exceeds the range. Then the frags of shinfo are changed by other process. [ 157.724634] stack segment: 0000 [#1] PREEMPT SMP NOPTI [ 157.725358] CPU: 3 PID: 679 Comm: xdp_pass_user_f Tainted: G E 6.2.0+ #150 [ 157.726401] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/4 [ 157.727820] RIP: 0010:skb_release_data+0x11b/0x180 [ 157.728449] Code: 44 24 02 48 83 c3 01 39 d8 7e be 48 89 d8 48 c1 e0 04 41 80 7d 7e 00 49 8b 6c 04 30 79 0c 48 89 ef e8 89 b [ 157.730751] RSP: 0018:ffffc90000178b48 EFLAGS: 00010202 [ 157.731383] RAX: 0000000000000010 RBX: 0000000000000001 RCX: 0000000000000000 [ 157.732270] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff888100dd0b00 [ 157.733117] RBP: 5d5d76010f6e2408 R08: ffff888100dd0b2c R09: 0000000000000000 [ 157.734013] R10: ffffffff82effd30 R11: 000000000000a14e R12: ffff88810981ffc0 [ 157.734904] R13: ffff888100dd0b00 R14: 0000000000000002 R15: 0000000000002310 [ 157.735793] FS: 00007f06121d9740(0000) GS:ffff88842fcc0000(0000) knlGS:0000000000000000 [ 157.736794] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 157.737522] CR2: 00007ffd9a56c084 CR3: 0000000104bda001 CR4: 0000000000770ee0 [ 157.738420] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 157.739283] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 157.740146] PKRU: 55555554 [ 157.740502] Call Trace: [ 157.740843] <IRQ> [ 157.741117] kfree_skb_reason+0x50/0x120 [ 157.741613] __udp4_lib_rcv+0x52b/0x5e0 [ 157.742132] ip_protocol_deliver_rcu+0xaf/0x190 [ 157.742715] ip_local_deliver_finish+0x77/0xa0 [ 157.743280] ip_sublist_rcv_finish+0x80/0x90 [ 157.743834] ip_list_rcv_finish.constprop.0+0x16f/0x190 [ 157.744493] ip_list_rcv+0x126/0x140 [ 157.744952] __netif_receive_skb_list_core+0x29b/0x2c0 [ 157.745602] __netif_receive_skb_list+0xed/0x160 [ 157.746190] ? udp4_gro_receive+0x275/0x350 [ 157.746732] netif_receive_skb_list_internal+0xf2/0x1b0 [ 157.747398] napi_gro_receive+0xd1/0x210 [ 157.747911] virtnet_receive+0x75/0x1c0 [ 157.748422] virtnet_poll+0x48/0x1b0 [ 157.748878] __napi_poll+0x29/0x1b0 [ 157.749330] net_rx_action+0x27a/0x340 [ 157.749812] __do_softirq+0xf3/0x2fb [ 157.750298] do_softirq+0xa2/0xd0 [ 157.750745] </IRQ> [ 157.751563] <TASK> [ 157.752329] __local_bh_enable_ip+0x6d/0x80 [ 157.753178] virtnet_xdp_set+0x482/0x860 [ 157.754159] ? __pfx_virtnet_xdp+0x10/0x10 [ 157.755129] dev_xdp_install+0xa4/0xe0 [ 157.756033] dev_xdp_attach+0x20b/0x5e0 [ 157.756933] do_setlink+0x82e/0xc90 [ 157.757777] ? __nla_validate_parse+0x12b/0x1e0 [ 157.758744] rtnl_setlink+0xd8/0x170 [ 157.759549] ? mod_objcg_state+0xcb/0x320 [ 157.760328] ? security_capable+0x37/0x60 [ 157.761209] ? security_capable+0x37/0x60 [ 157.762072] rtnetlink_rcv_msg+0x145/0x3d0 [ 157.762929] ? ___slab_alloc+0x327/0x610 [ 157.763754] ? __alloc_skb+0x141/0x170 [ 157.764533] ? __pfx_rtnetlink_rcv_msg+0x10/0x10 [ 157.765422] netlink_rcv_skb+0x58/0x110 [ 157.766229] netlink_unicast+0x21f/0x330 [ 157.766951] netlink_sendmsg+0x240/0x4a0 [ 157.767654] sock_sendmsg+0x93/0xa0 [ 157.768434] ? sockfd_lookup_light+0x12/0x70 [ 157.769245] __sys_sendto+0xfe/0x170 [ 157.770079] ? handle_mm_fault+0xe9/0x2d0 [ 157.770859] ? preempt_count_add+0x51/0xa0 [ 157.771645] ? up_read+0x3c/0x80 [ 157.772340] ? do_user_addr_fault+0x1e9/0x710 [ 157.773166] ? kvm_read_and_reset_apf_flags+0x49/0x60 [ 157.774087] __x64_sys_sendto+0x29/0x30 [ 157.774856] do_syscall_64+0x3c/0x90 [ 157.775518] entry_SYSCALL_64_after_hwframe+0x72/0xdc [ 157.776382] RIP: 0033:0x7f06122def70 Fixes: 18117a842ab0 ("virtio-net: remove xdp related info from page_to_skb()") Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * virtio_net: add checking sq is full inside xdp xmitXuan Zhuo2023-03-101-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the queue of xdp xmit is not an independent queue, then when the xdp xmit used all the desc, the xmit from the __dev_queue_xmit() may encounter the following error. net ens4: Unexpected TXQ (0) queue failure: -28 This patch adds a check whether sq is full in xdp xmit. Fixes: 56434a01b12e ("virtio_net: add XDP_TX support") Reported-by: Yichun Zhang <yichun@openresty.com> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * virtio_net: separate the logic of checking whether sq is fullXuan Zhuo2023-03-101-24/+36
| | | | | | | | | | | | | | | | | | | | | | Separate the logic of checking whether sq is full. The subsequent patch will reuse this func. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * virtio_net: reorder some funcsXuan Zhuo2023-03-101-46/+46
| | | | | | | | | | | | | | | | | | The purpose of this is to facilitate the subsequent addition of new functions without introducing a separate declaration. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | net: virtio_net: implement exact header length guest featureJiri Pirko2023-03-131-2/+4
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Virtio spec introduced a feature VIRTIO_NET_F_GUEST_HDRLEN which when set implicates that device benefits from knowing the exact size of the header. For compatibility, to signal to the device that the header is reliable driver also needs to set this feature. Without this feature set by driver, device has to figure out the header size itself. Quoting the original virtio spec: "hdr_len is a hint to the device as to how much of the header needs to be kept to copy into each packet" "a hint" might not be clear for the reader what does it mean, if it is "maybe like that" of "exactly like that". This feature just makes it crystal clear and let the device count on the hdr_len being filled up by the exact length of header. Also note the spec already has following note about hdr_len: "Due to various bugs in implementations, this field is not useful as a guarantee of the transport header size." Without this feature the device needs to parse the header in core data path handling. Accurate information helps the device to eliminate such header parsing and directly use the hardware accelerators for GSO operation. virtio_net_hdr_from_skb() fills up hdr_len to skb_headlen(skb). The driver already complies to fill the correct value. Introduce the feature and advertise it. Note that virtio spec also includes following note for device implementation: "Caution should be taken by the implementation so as to prevent a malicious driver from attacking the device by setting an incorrect hdr_len." There is a plan to support this feature in our emulated device. A device of SolidRun offers this feature bit. They claim this feature will save the device a few cycles for every GSO packet. Link: https://docs.oasis-open.org/virtio/virtio/v1.2/cs01/virtio-v1.2-cs01.html#x1-230006x3 Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Alvaro Karsz <alvaro.karsz@solid-run.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20230309094559.917857-1-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* Daniel Borkmann says:Jakub Kicinski2023-02-101-1/+7
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ==================== pull-request: bpf-next 2023-02-11 We've added 96 non-merge commits during the last 14 day(s) which contain a total of 152 files changed, 4884 insertions(+), 962 deletions(-). There is a minor conflict in drivers/net/ethernet/intel/ice/ice_main.c between commit 5b246e533d01 ("ice: split probe into smaller functions") from the net-next tree and commit 66c0e13ad236 ("drivers: net: turn on XDP features") from the bpf-next tree. Remove the hunk given ice_cfg_netdev() is otherwise there a 2nd time, and add XDP features to the existing ice_cfg_netdev() one: [...] ice_set_netdev_features(netdev); netdev->xdp_features = NETDEV_XDP_ACT_BASIC | NETDEV_XDP_ACT_REDIRECT | NETDEV_XDP_ACT_XSK_ZEROCOPY; ice_set_ops(netdev); [...] Stephen's merge conflict mail: https://lore.kernel.org/bpf/20230207101951.21a114fa@canb.auug.org.au/ The main changes are: 1) Add support for BPF trampoline on s390x which finally allows to remove many test cases from the BPF CI's DENYLIST.s390x, from Ilya Leoshkevich. 2) Add multi-buffer XDP support to ice driver, from Maciej Fijalkowski. 3) Add capability to export the XDP features supported by the NIC. Along with that, add a XDP compliance test tool, from Lorenzo Bianconi & Marek Majtyka. 4) Add __bpf_kfunc tag for marking kernel functions as kfuncs, from David Vernet. 5) Add a deep dive documentation about the verifier's register liveness tracking algorithm, from Eduard Zingerman. 6) Fix and follow-up cleanups for resolve_btfids to be compiled as a host program to avoid cross compile issues, from Jiri Olsa & Ian Rogers. 7) Batch of fixes to the BPF selftest for xdp_hw_metadata which resulted when testing on different NICs, from Jesper Dangaard Brouer. 8) Fix libbpf to better detect kernel version code on Debian, from Hao Xiang. 9) Extend libbpf to add an option for when the perf buffer should wake up, from Jon Doron. 10) Follow-up fix on xdp_metadata selftest to just consume on TX completion, from Stanislav Fomichev. 11) Extend the kfuncs.rst document with description on kfunc lifecycle & stability expectations, from David Vernet. 12) Fix bpftool prog profile to skip attaching to offline CPUs, from Tonghao Zhang. ==================== Link: https://lore.kernel.org/r/20230211002037.8489-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * virtio_net: Update xdp_features with xdp multi-buffLorenzo Bianconi2023-02-071-2/+4
| | | | | | | | | | | | | | | | | | Now virtio-net supports xdp multi-buffer so add it to xdp_features. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/bpf/60c76cd63a0246db785606e8891b925fd5c9bf06.1675763384.git.lorenzo@kernel.org
| * drivers: net: turn on XDP featuresMarek Majtyka2023-02-021-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A summary of the flags being set for various drivers is given below. Note that XDP_F_REDIRECT_TARGET and XDP_F_FRAG_TARGET are features that can be turned off and on at runtime. This means that these flags may be set and unset under RTNL lock protection by the driver. Hence, READ_ONCE must be used by code loading the flag value. Also, these flags are not used for synchronization against the availability of XDP resources on a device. It is merely a hint, and hence the read may race with the actual teardown of XDP resources on the device. This may change in the future, e.g. operations taking a reference on the XDP resources of the driver, and in turn inhibiting turning off this flag. However, for now, it can only be used as a hint to check whether device supports becoming a redirection target. Turn 'hw-offload' feature flag on for: - netronome (nfp) - netdevsim. Turn 'native' and 'zerocopy' features flags on for: - intel (i40e, ice, ixgbe, igc) - mellanox (mlx5). - stmmac - netronome (nfp) Turn 'native' features flags on for: - amazon (ena) - broadcom (bnxt) - freescale (dpaa, dpaa2, enetc) - funeth - intel (igb) - marvell (mvneta, mvpp2, octeontx2) - mellanox (mlx4) - mtk_eth_soc - qlogic (qede) - sfc - socionext (netsec) - ti (cpsw) - tap - tsnep - veth - xen - virtio_net. Turn 'basic' (tx, pass, aborted and drop) features flags on for: - netronome (nfp) - cavium (thunder) - hyperv. Turn 'redirect_target' feature flag on for: - amanzon (ena) - broadcom (bnxt) - freescale (dpaa, dpaa2) - intel (i40e, ice, igb, ixgbe) - ti (cpsw) - marvell (mvneta, mvpp2) - sfc - socionext (netsec) - qlogic (qede) - mellanox (mlx5) - tap - veth - virtio_net - xen Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Acked-by: Stanislav Fomichev <sdf@google.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Co-developed-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Marek Majtyka <alardam@gmail.com> Link: https://lore.kernel.org/r/3eca9fafb308462f7edb1f58e451d59209aa07eb.1675245258.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
* | virtio-net: Maintain reverse cleanup orderParav Pandit2023-02-061-1/+1
| | | | | | | | | | | | | | | | | | | | To easily audit the code, better to keep the device stop() sequence to be mirror of the device open() sequence. Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2023-02-021-1/+1
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | net/core/gro.c 7d2c89b32587 ("skb: Do mix page pool and page referenced frags in GRO") b1a78b9b9886 ("net: add support for ipv4 big tcp") https://lore.kernel.org/all/20230203094454.5766f160@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | virtio-net: Keep stop() to follow mirror sequence of open()Parav Pandit2023-02-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Cited commit in fixes tag frees rxq xdp info while RQ NAPI is still enabled and packet processing may be ongoing. Follow the mirror sequence of open() in the stop() callback. This ensures that when rxq info is unregistered, no rx packet processing is ongoing. Fixes: 754b8a21a96d ("virtio_net: setup xdp_rxq_info") Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Parav Pandit <parav@nvidia.com> Link: https://lore.kernel.org/r/20230202163516.12559-1-parav@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | | virtio-net: fix possible unsigned integer overflowHeng Qi2023-02-011-6/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the single-buffer xdp is loaded and after xdp_linearize_page() is called, *num_buf becomes 0 and (*num_buf - 1) may overflow into a large integer in virtnet_build_xdp_buff_mrg(), resulting in unexpected packet dropping. Fixes: ef75cb51f139 ("virtio-net: build xdp_buff with multi buffers") Signed-off-by: Heng Qi <hengqi@linux.alibaba.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20230131085004.98687-1-hengqi@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | | virtio_net: notify MAC address change on device initializationLaurent Vivier2023-02-011-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In virtnet_probe(), if the device doesn't provide a MAC address the driver assigns a random one. As we modify the MAC address we need to notify the device to allow it to update all the related information. The problem can be seen with vDPA and mlx5_vdpa driver as it doesn't assign a MAC address by default. The virtio_net device uses a random MAC address (we can see it with "ip link"), but we can't ping a net namespace from another one using the virtio-vdpa device because the new MAC address has not been provided to the hardware: RX packets are dropped since they don't go through the receive filters, TX packets go through unaffected. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | | virtio_net: disable VIRTIO_NET_F_STANDBY if VIRTIO_NET_F_MAC is not setLaurent Vivier2023-02-011-0/+6
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | failover relies on the MAC address to pair the primary and the standby devices: "[...] the hypervisor needs to enable VIRTIO_NET_F_STANDBY feature on the virtio-net interface and assign the same MAC address to both virtio-net and VF interfaces." Documentation/networking/net_failover.rst This patch disables the STANDBY feature if the MAC address is not provided by the hypervisor. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>