Commit message | Author | Age | Files | Lines
* ice: Add tx_scheduling_layers devlink param | Lukasz Czapnik | 2024-04-22 | 6 | -10/+191

  It was observed that Tx performance was inconsistent across queues and/or VSIs, and that this was directly connected to the existing 9-layer topology of the Tx scheduler.

  Introduce a new private devlink param, tx_scheduling_layers. This parameter lets the user choose the 5-layer transmit scheduler topology, which helps to smooth out transmit performance.

  Allowed parameter values are 5 and 9.

  Example usage:

  Show:
    devlink dev param show pci/0000:4b:00.0 name tx_scheduling_layers
    pci/0000:4b:00.0:
      name tx_scheduling_layers type driver-specific
        values:
          cmode permanent value 9

  Set:
    devlink dev param set pci/0000:4b:00.0 name tx_scheduling_layers value 5 cmode permanent
    devlink dev param set pci/0000:4b:00.0 name tx_scheduling_layers value 9 cmode permanent

  Signed-off-by: Lukasz Czapnik <lukasz.czapnik@intel.com>
  Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
  Co-developed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
  Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
  Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com>
  Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
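  The set commands above can be wrapped in a small helper that rejects anything other than the two documented values before devlink is ever invoked. This is only a sketch: the function name `tx_layers_set_cmd` and the constant `ALLOWED_LAYERS` are hypothetical, and the helper only builds the argument list rather than running it.

```python
# Hypothetical helper; only the devlink command line itself comes from the
# commit message above.
ALLOWED_LAYERS = (5, 9)  # the only values the commit message lists as valid


def tx_layers_set_cmd(pci_dev, layers):
    """Return the argv for 'devlink dev param set ... tx_scheduling_layers'."""
    if layers not in ALLOWED_LAYERS:
        raise ValueError(
            "tx_scheduling_layers must be one of %s, got %r" % (ALLOWED_LAYERS, layers)
        )
    return [
        "devlink", "dev", "param", "set", "pci/" + pci_dev,
        "name", "tx_scheduling_layers",
        "value", str(layers),
        "cmode", "permanent",
    ]


if __name__ == "__main__":
    # Prints the same command as the first "Set:" example above.
    print(" ".join(tx_layers_set_cmd("0000:4b:00.0", 5)))
```

  Because the parameter uses cmode permanent and the topology switch happens in the driver init path, a changed value would presumably only take effect after the reboot described in the following commit.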
* ice: Enable switching default Tx scheduler topology | Michal Wilczynski | 2024-04-22 | 1 | -19/+89

  Introduce support for Tx scheduler topology change, based on user selection, from the default 9-layer to 5-layer. The change requires NVM (version 3.20 or newer) and DDP package (OS Package 1.3.30 or newer - available for over a year in linux-firmware, since commit aed71f296637 in linux-firmware ("ice: Update package to 1.3.30.0"))

  https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=aed71f296637

  Enable the 5-layer topology switch in the init path of the driver. To accomplish that, the upload of the DDP package needs to be delayed until the change in Tx topology is finished. To trigger the Tx change, the user selection should be changed in NVM using devlink. Then the platform should be rebooted.

  Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
  Co-developed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
  Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
  Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com>
  Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
* ice: Adjust the VSI/Aggregator layers | Raj Victor | 2024-04-22 | 1 | -18/+19

  Adjust the VSI/Aggregator layers based on the number of logical layers supported by the FW. Currently the VSI and Aggregator layers are fixed based on the 9 layer scheduler tree layout. Due to performance reasons the number of layers of the scheduler tree is changing from 9 to 5. It requires a readjustment of these VSI/Aggregator layer values.

  Signed-off-by: Raj Victor <victor.raj@intel.com>
  Co-developed-by: Michal Wilczynski <michal.wilczynski@intel.com>
  Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
  Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
  Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com>
  Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
* ice: Support 5 layer topology | Raj Victor | 2024-04-22 | 6 | -0/+251

  There is a performance issue when the number of VSIs is not a multiple of 8. This is caused by the limit on the number of children per node (8) in the 9 layer topology. The BW credits are shared evenly among the children by default. Assume one node has 8 children and the other has 1. The parent of these nodes shares the BW credits equally between them. This causes a problem for the first node, which has 8 children: the 9th VM gets more BW credits than the first 8 VMs.

  Example:

  1) With 8 VMs:
  for x in 0 1 2 3 4 5 6 7; do taskset -c ${x} netperf -P0 -H 172.68.169.125 & sleep .1 ; done

  tx_queue_0_packets: 23283027
  tx_queue_1_packets: 23292289
  tx_queue_2_packets: 23276136
  tx_queue_3_packets: 23279828
  tx_queue_4_packets: 23279828
  tx_queue_5_packets: 23279333
  tx_queue_6_packets: 23277745
  tx_queue_7_packets: 23279950
  tx_queue_8_packets: 0

  2) With 9 VMs:
  for x in 0 1 2 3 4 5 6 7 8; do taskset -c ${x} netperf -P0 -H 172.68.169.125 & sleep .1 ; done

  tx_queue_0_packets: 24163396
  tx_queue_1_packets: 24164623
  tx_queue_2_packets: 24163188
  tx_queue_3_packets: 24163701
  tx_queue_4_packets: 24163683
  tx_queue_5_packets: 24164668
  tx_queue_6_packets: 23327200
  tx_queue_7_packets: 24163853
  tx_queue_8_packets: 91101417

  So on average, the queue 8 statistics show that 3.7 times more packets were sent there than to the other queues.

  The FW, starting with version 3.20, has increased the max number of children per node by reducing the number of layers from 9 to 5. Reflect this on the driver side.

  Signed-off-by: Raj Victor <victor.raj@intel.com>
  Co-developed-by: Michal Wilczynski <michal.wilczynski@intel.com>
  Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
  Co-developed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
  Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
  Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com>
  Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
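  The even-split behaviour described above can be worked through with a small model. This is an idealized sketch, not the firmware's scheduler: `leaf_shares` is a hypothetical helper, and the single-node case assumes (the commit does not give the new limit) that the 5-layer topology lets all 9 leaves sit under one node. The packet counters are the ones quoted in the commit message.

```python
from fractions import Fraction


def leaf_shares(children_per_node):
    """Share of the parent's BW credits that each leaf gets under even splitting.

    children_per_node has one entry per scheduler node, giving its leaf count.
    The parent splits evenly between nodes, each node evenly between its leaves.
    """
    node_share = Fraction(1, len(children_per_node))
    return [node_share / n for n in children_per_node for _ in range(n)]


# 9-layer case from the commit message: 9 VMs, at most 8 children per node.
nine_layer = leaf_shares([8, 1])
# Assumed 5-layer case: one node is allowed to hold all 9 children.
five_layer = leaf_shares([9])

# Ratio actually measured, from the quoted tx_queue_*_packets counters.
first_eight = [24163396, 24164623, 24163188, 24163701,
               24163683, 24164668, 23327200, 24163853]
observed_ratio = 91101417 / (sum(first_eight) / len(first_eight))

if __name__ == "__main__":
    # First eight leaves get 1/16 each, the ninth gets 1/2: an 8x imbalance.
    print(nine_layer[0], nine_layer[-1], nine_layer[-1] / nine_layer[0])
    # With a single node, every leaf gets 1/9.
    print(five_layer[0], five_layer[-1])
    # Measured imbalance, between 3.7 and 3.9.
    print(observed_ratio)
```

  The model gives an 8x imbalance for fully credit-limited traffic, while the measured ratio is lower; the sketch only illustrates the direction of the effect, not its exact size.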
* devlink: extend devlink_param *set pointer | Mateusz Polchlopek | 2024-04-22 | 22 | -37/+66

  Extend the devlink_param *set function pointer to take extack as a param. Sometimes the set function needs to pass information to the end user, and it is more appropriate to do that through netlink than by printing a message to dmesg.

  Reviewed-by: Jiri Pirko <jiri@nvidia.com>
  Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
  Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
  Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
* tcp: do not export tcp_twsk_purge() | Eric Dumazet | 2024-04-22 | 1 | -1/+0

  After commit 1eeb50435739 ("tcp/dccp: do not care about families in inet_twsk_purge()") tcp_twsk_purge() is no longer potentially called from a module.

  Signed-off-by: Eric Dumazet <edumazet@google.com>
  Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
  Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
  Signed-off-by: David S. Miller <davem@davemloft.net>
* octeontx2-pf: Add support for offload tc with skbedit mark action | Geetha sowjanya | 2024-04-22 | 6 | -0/+24

  Support offloading of skbedit mark action.

  For example, to mark with 0x0008, with dest ip 60.60.60.2 on eth2 interface:

   # tc qdisc add dev eth2 ingress
   # tc filter add dev eth2 ingress protocol ip flower \
        dst_ip 60.60.60.2 action skbedit mark 0x0008 skip_sw

  Signed-off-by: Geetha sowjanya <gakula@marvell.com>
  Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
  Reviewed-by: Simon Horman <horms@kernel.org>
  Signed-off-by: David S. Miller <davem@davemloft.net>
* net: ethernet: ti: am65-cpsw: Fix xdp_rxq error for disabled port | Julien Panis | 2024-04-22 | 1 | -0/+6

  When an ethX port is disabled in the device tree, an error is returned by the xdp_rxq_info_reg() function while transitioning the CPSW device to the up state. The message 'Missing net_device from driver' is output.

  This patch fixes the issue by registering xdp_rxq info only if the ethX port is enabled (i.e. the ndev pointer is not NULL).

  Fixes: 8acacc40f733 ("net: ethernet: ti: am65-cpsw: Add minimal XDP support")
  Link: https://lore.kernel.org/all/260d258f-87a1-4aac-8883-aab4746b32d8@ti.com/
  Reported-by: Siddharth Vadapalli <s-vadapalli@ti.com>
  Closes: https://gist.github.com/Siddharth-Vadapalli-at-TI/5ed0e436606001c247a7da664f75edee
  Signed-off-by: Julien Panis <jpanis@baylibre.com>
  Signed-off-by: David S. Miller <davem@davemloft.net>
* sysctl: treewide: constify ctl_table_header::ctl_table_arg | Thomas Weißschuh | 2024-04-22 | 27 | -30/+30

  To be able to constify instances of struct ctl_table it is necessary to remove the ways through which non-const versions are exposed from the sysctl core. One of these is the ctl_table_arg member of struct ctl_table_header.

  Constify this reference as a prerequisite for the full constification of struct ctl_table instances. No functional change.

  Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
  Reviewed-by: Kees Cook <keescook@chromium.org>
  Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'testing-make-netfilter-selftests-functional-in-vng-environment' | Jakub Kicinski | 2024-04-19 | 12 | -520/+498

  Florian Westphal says:

  ====================
  testing: make netfilter selftests functional in vng environment

  This is the second batch of the netfilter selftest move.

  Changes since v1:
  - makefile and kernel config are updated to have all required features
  - fix makefile with missing bits to make kselftest-install work
  - test it via vng as per https://github.com/linux-netdev/nipa/wiki/How-to-run-netdev-selftests-CI-style (Thanks Jakub!)
  - squash a few fixes, e.g. nft_queue.sh v1 had a race w. NFNETLINK_QUEUE=m
  - add a settings file with 8m timeout, for nft_concat_range.sh sake. That script can be sped up a bit, I think, but it's not contained in this batch yet.
  - toss the first two bogus rebase artifacts (Matthieu Baerts)

  Scripts are moved to lib.sh infra. This allows using the busywait helper and ditching the various 'sleep 2' calls all over the place.

  Tested on Fedora 39:
    vng --build --config tools/testing/selftests/net/netfilter/config
    make -C tools/testing/selftests/ TARGETS=net/netfilter
    vng -v --run . --user root --cpus 2 -- \
        make -C tools/testing/selftests TARGETS=net/netfilter run_tests

  ... all tests pass except nft_audit.sh which SKIPs due to nft version mismatch (Fedora is on nft 1.0.7 which lacks reset keyword support).

  Missing/WIP bits:
  - speed up nft_concat_range.sh test
  - extend flowtable selftest
  - shellcheck fixups for remaining scripts
  ====================

  Link: https://lore.kernel.org/r/20240418152744.15105-1-fw@strlen.de
  Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * selftests: netfilter: update makefiles and kernel config | Florian Westphal | 2024-04-19 | 3 | -1/+57

    Jakub reports the Makefile missed a few updates to make kselftest-install work for the netfilter tests and points out that the config file lacks many dependencies such as VETH support.

    The settings file (timeout 8m) is added for the nft_concat_range.sh script, which can take several minutes to complete.

    Fixes: 3f189349e52a ("selftests: netfilter: move to net subdir")
    Reported-by: Jakub Kicinski <kuba@kernel.org>
    Closes: https://lore.kernel.org/all/20240412175413.04e5e616@kernel.org/
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Link: https://lore.kernel.org/r/20240418152744.15105-13-fw@strlen.de
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * selftests: netfilter: nft_audit.sh: add more skip checks | Florian Westphal | 2024-04-19 | 1 | -4/+26

    This testcase doesn't work if auditd is running; audit_logread will not receive any data in that case.

    Add a nftables feature test for the reset keyword and skip this test if that fails.

    While at it, do a few minor shellcheck cleanups.

    Signed-off-by: Florian Westphal <fw@strlen.de>
    Link: https://lore.kernel.org/r/20240418152744.15105-12-fw@strlen.de
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * selftests: netfilter: nft_meta.sh: small shellcheck cleanup | Florian Westphal | 2024-04-19 | 1 | -2/+2

    shellcheck complains about missing "", so add those.

    Signed-off-by: Florian Westphal <fw@strlen.de>
    Link: https://lore.kernel.org/r/20240418152744.15105-11-fw@strlen.de
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * selftests: netfilter: nft_fib.sh: shellcheck cleanups | Florian Westphal | 2024-04-19 | 1 | -67/+61

    No functional change intended.

    Signed-off-by: Florian Westphal <fw@strlen.de>
    Link: https://lore.kernel.org/r/20240418152744.15105-10-fw@strlen.de
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * selftests: netfilter: conntrack_ipip_mtu.sh: shellcheck cleanups | Florian Westphal | 2024-04-19 | 1 | -37/+37

    No functional change intended.

    Signed-off-by: Florian Westphal <fw@strlen.de>
    Link: https://lore.kernel.org/r/20240418152744.15105-9-fw@strlen.de
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * selftests: netfilter: nft_nat_zones.sh: shellcheck cleanups | Florian Westphal | 2024-04-19 | 1 | -118/+75

    While at it: No need for iperf here, use socat. This also reduces the script runtime.

    Signed-off-by: Florian Westphal <fw@strlen.de>
    Link: https://lore.kernel.org/r/20240418152744.15105-8-fw@strlen.de
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * selftests: netfilter: xt_string.sh: shellcheck cleanups | Florian Westphal | 2024-04-19 | 1 | -17/+17

    No functional change intended.

    Signed-off-by: Florian Westphal <fw@strlen.de>
    Link: https://lore.kernel.org/r/20240418152744.15105-7-fw@strlen.de
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * selftests: netfilter: xt_string.sh: move to lib.sh infra | Florian Westphal | 2024-04-19 | 1 | -25/+30

    Intentional changes:
    - Use socat instead of netcat
    - Use a temporary file instead of a pipe, else packets do not match "-m string" rules: multiple writes to the pipe cause multiple packets, but the test needs the data to arrive in a single packet to work.

    Signed-off-by: Florian Westphal <fw@strlen.de>
    Link: https://lore.kernel.org/r/20240418152744.15105-6-fw@strlen.de
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * selftests: netfilter: nft_zones_many.sh: move to lib.sh infra | Florian Westphal | 2024-04-19 | 1 | -48/+45

    Also do shellcheck cleanups here, no functional changes intended.

    When running tests via the vng tool, the packetpath insertion test fails:

      dd: failed to open '/dev/stdout': Device or resource busy

    Just omit 'of=' and this will work as intended.

    Signed-off-by: Florian Westphal <fw@strlen.de>
    Link: https://lore.kernel.org/r/20240418152744.15105-5-fw@strlen.de
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * selftests: netfilter: nft_synproxy.sh: move to lib.sh infra | Florian Westphal | 2024-04-19 | 1 | -49/+28

    Use the checktool helper where applicable.

    Signed-off-by: Florian Westphal <fw@strlen.de>
    Link: https://lore.kernel.org/r/20240418152744.15105-4-fw@strlen.de
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * selftests: netfilter: nft_queue.sh: shellcheck cleanups | Florian Westphal | 2024-04-19 | 1 | -108/+103

    No functional change intended. Disable frequent shellcheck warnings wrt. "unreachable" code; those helpers get called indirectly from the busywait helper.

    Signed-off-by: Florian Westphal <fw@strlen.de>
    Link: https://lore.kernel.org/r/20240418152744.15105-3-fw@strlen.de
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * selftests: netfilter: nft_queue.sh: move to lib.sh infra | Florian Westphal | 2024-04-19 | 1 | -61/+34

    - switch to socat, like other tests
    - use busywait helper to test once listener netns is ready
    - do not generate multiple input test files, only generate one and use the cleanup hook to remove it, like other temporary files.

    Signed-off-by: Florian Westphal <fw@strlen.de>
    Link: https://lore.kernel.org/r/20240418152744.15105-2-fw@strlen.de
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* Merge branch 'net-neigh-rcu' | David S. Miller | 2024-04-19 | 1 | -32/+36

  Eric Dumazet says:

  ====================
  neighbour: convert neigh_dump_info() to RCU

  Remove RTNL requirement for "ip neighbour show" command.
  ====================

  Signed-off-by: David S. Miller <davem@davemloft.net>
| * neighbour: no longer hold RTNL in neigh_dump_info() | Eric Dumazet | 2024-04-19 | 1 | -4/+5

    neigh_dump_table() is already relying on RCU protection.

    pneigh_dump_table() is using its own protection (tbl->lock).

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * neighbour: fix neigh_dump_info() return value | Eric Dumazet | 2024-04-19 | 1 | -18/+13

    Change neigh_dump_table() and pneigh_dump_table() to either return 0 or -EMSGSIZE if not enough space was available in the skb.

    Then neigh_dump_info() can do the same.

    This allows NLMSG_DONE to be appended to the current skb at the end of a dump, saving a couple of recvmsg() system calls.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * neighbour: add RCU protection to neigh_tables[] | Eric Dumazet | 2024-04-19 | 1 | -11/+19

    In order to remove RTNL protection from neightbl_dump_info() and neigh_dump_info() later, we need to add RCU protection to neigh_tables[].

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* net: dsa: xrs700x: fix missing initialisation of ds->phylink_mac_ops | Russell King (Oracle) | 2024-04-19 | 1 | -0/+1

  The kernel build bot identified the following mistake in the recently merged 860a9bed2651 ("net: dsa: xrs700x: provide own phylink MAC operations") patch:

  drivers/net/dsa/xrs700x/xrs700x.c:714:37: warning: 'xrs700x_phylink_mac_ops' defined but not used [-Wunused-const-variable=]
    714 | static const struct phylink_mac_ops xrs700x_phylink_mac_ops = {
        |                                     ^~~~~~~~~~~~~~~~~~~~~~~

  Fix the omitted assignment of ds->phylink_mac_ops.

  Reported-by: kernel test robot <lkp@intel.com>
  Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
  Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'net-rps-lockless' | David S. Miller | 2024-04-19 | 1 | -9/+9

  Jason Xing says:

  ====================
  locklessly protect left members in struct rps_dev_flow

  From: Jason Xing <kernelxing@tencent.com>

  Since Eric made a more complicated lockless change to the last_qtail member [1] in struct rps_dev_flow, the remaining members are easier to change in the same way.

  One important thing I would like to share by quoting Eric: "rflow is located in rxqueue->rps_flow_table, it is thus private to current thread. Only one cpu can service an RX queue at a time."

  So we only need to pay attention to the reader in rps_may_expire_flow() and the writer in set_rps_cpu(). They run in two different contexts.

  [1]: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=3b4cf29bdab

  v3
  Link: https://lore.kernel.org/all/20240417062721.45652-1-kerneljasonxing@gmail.com/
  1. adjust the protection in a right way (Eric)

  v2
  1. fix passing wrong type qtail.
  ====================

  Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: rps: locklessly access rflow->cpu | Jason Xing | 2024-04-19 | 1 | -1/+1

    This is the last member in struct rps_dev_flow which should be protected locklessly. So finish it.

    Signed-off-by: Jason Xing <kernelxing@tencent.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: rps: protect filter locklessly | Jason Xing | 2024-04-19 | 1 | -4/+4

    As we can see, rflow->filter can be written/read concurrently, so lockless access is needed.

    Signed-off-by: Jason Xing <kernelxing@tencent.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: rps: protect last_qtail with rps_input_queue_tail_save() helper | Jason Xing | 2024-04-19 | 1 | -4/+4

    Remove one unnecessary reader protection and add another writer protection to finish the lockless protection job.

    Note: the removed READ_ONCE() is not needed because we only have to protect the lockless reader in the different context (rps_may_expire_flow()).

    Signed-off-by: Jason Xing <kernelxing@tencent.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'net_sched-dump-no-rtnl' | David S. Miller | 2024-04-19 | 15 | -234/+323

  Eric Dumazet says:

  ====================
  net_sched: first series for RTNL-less qdisc dumps

  Medium term goal is to implement "tc qdisc show" without needing to acquire RTNL.

  This first series makes the requested changes in 14 qdiscs.

  Notes:

  - RTNL is still held in "tc qdisc show", more changes are needed.

  - Qdiscs returning many attributes might want/need to provide a consistent set of attributes. If that is the case, their dump() method could acquire the qdisc spinlock, to pair with the spinlock acquisition in their change() method.

  V2: Addressed Simon feedback (Thanks a lot Simon)
  ====================

  Signed-off-by: David S. Miller <davem@davemloft.net>
| * net_sched: sch_skbprio: implement lockless skbprio_dump() | Eric Dumazet | 2024-04-19 | 1 | -3/+5

    Instead of relying on RTNL, skbprio_dump() can use a READ_ONCE() annotation, paired with a WRITE_ONCE() one in skbprio_change().

    Also add a READ_ONCE(sch->limit) in skbprio_enqueue().

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * net_sched: sch_pie: implement lockless pie_dump() | Eric Dumazet | 2024-04-19 | 1 | -18/+21

    Instead of relying on RTNL, pie_dump() can use READ_ONCE() annotations, paired with WRITE_ONCE() ones in pie_change().

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * net_sched: sch_hhf: implement lockless hhf_dump() | Eric Dumazet | 2024-04-19 | 1 | -14/+21

    Instead of relying on RTNL, hhf_dump() can use READ_ONCE() annotations, paired with WRITE_ONCE() ones in hhf_change().

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * net_sched: sch_hfsc: implement lockless accesses to q->defcls | Eric Dumazet | 2024-04-19 | 1 | -5/+4

    Instead of relying on RTNL, hfsc_dump_qdisc() can use a READ_ONCE() annotation, paired with a WRITE_ONCE() one in hfsc_change_qdisc().

    Use READ_ONCE(q->defcls) in hfsc_classify() to no longer acquire the qdisc lock from hfsc_change_qdisc().

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * net_sched: sch_fq_pie: implement lockless fq_pie_dump() | Eric Dumazet | 2024-04-19 | 1 | -27/+34

    Instead of relying on RTNL, fq_pie_dump() can use READ_ONCE() annotations, paired with WRITE_ONCE() ones in fq_pie_change().

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * net_sched: sch_fq_codel: implement lockless fq_codel_dump() | Eric Dumazet | 2024-04-19 | 1 | -22/+35

    Instead of relying on RTNL, fq_codel_dump() can use READ_ONCE() annotations, paired with WRITE_ONCE() ones in fq_codel_change().

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * net_sched: sch_fifo: implement lockless __fifo_dump() | Eric Dumazet | 2024-04-19 | 1 | -6/+7

    Instead of relying on RTNL, __fifo_dump() can use READ_ONCE() annotations, paired with WRITE_ONCE() ones in __fifo_init().

    Also add missing READ_ONCE(sch->limit) in bfifo_enqueue(), pfifo_enqueue() and pfifo_tail_enqueue().

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * net_sched: sch_ets: implement lockless ets_dump() | Eric Dumazet | 2024-04-19 | 1 | -11/+14

    Instead of relying on RTNL, ets_dump() can use READ_ONCE() annotations, paired with WRITE_ONCE() ones in ets_change().

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * net_sched: sch_tfs: implement lockless etf_dump() | Eric Dumazet | 2024-04-19 | 1 | -5/+5

    Instead of relying on RTNL, etf_dump() can use READ_ONCE() annotations.

    There is no etf_change() yet, this patch simply aligns this qdisc with others.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * net_sched: sch_codel: implement lockless codel_dump() | Eric Dumazet | 2024-04-19 | 1 | -11/+18

    Instead of relying on RTNL, codel_dump() can use READ_ONCE() annotations, paired with WRITE_ONCE() ones in codel_change().

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * net_sched: sch_choke: implement lockless choke_dump() | Eric Dumazet | 2024-04-19 | 2 | -16/+17

    Instead of relying on RTNL, choke_dump() can use READ_ONCE() annotations, paired with WRITE_ONCE() ones in choke_change().

    v2: added a WRITE_ONCE(p->Scell_log, Scell_log) per Simon feedback in V1
        Removed the READ_ONCE(q->limit) in choke_enqueue()

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * net_sched: sch_cbs: implement lockless cbs_dump() | Eric Dumazet | 2024-04-19 | 1 | -10/+10

    Instead of relying on RTNL, cbs_dump() can use READ_ONCE() annotations, paired with WRITE_ONCE() ones in cbs_change().

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * net_sched: cake: implement lockless cake_dump() | Eric Dumazet | 2024-04-19 | 1 | -47/+63

    Instead of relying on RTNL, cake_dump() can use READ_ONCE() annotations, paired with WRITE_ONCE() ones in cake_change().

    v2: addressed Simon feedback in V1: https://lore.kernel.org/netdev/20240417083549.GA3846178@kernel.org/

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Toke Høiland-Jørgensen <toke@toke.dk>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * net_sched: sch_fq: implement lockless fq_dump() | Eric Dumazet | 2024-04-19 | 1 | -39/+69

    Instead of relying on RTNL, fq_dump() can use READ_ONCE() annotations, paired with WRITE_ONCE() in fq_change().

    v2: Addressed Simon feedback in V1: https://lore.kernel.org/netdev/20240416181915.GT2320920@kernel.org/

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* gve: Remove qpl_cfg struct since qpl_ids map with queues respectively | Ziwei Xiao | 2024-04-18 | 7 | -113/+20

  The qpl_cfg struct was used to make sure that no two different queues are using a QPL with the same qpl_id. We can remove that qpl_cfg struct since the qpl_ids now map to the queues as follows:

  For tx queues: qpl_id = tx_qid
  For rx queues: qpl_id = max_tx_queues + rx_qid

  When XDP is used, the user needs to reduce the tx queues to at most half of max_tx_queues. XDP then uses the same number of tx queues, starting from the end of the existing tx queues. So the XDP queues will not exceed the max_tx_queues range and will not overlap with the rx queues, and the qpl_ids will not overlap either.

  Given that, we remove the qpl_cfg struct and get the qpl_id directly from the queue id. Unless we are erroneously allocating a rx/tx queue that has already been allocated, we would never allocate the qpl with the same qpl_id twice. In that case, it should fail much earlier than the QPL assignment.

  Suggested-by: Praveen Kaligineedi <pkaligineedi@google.com>
  Signed-off-by: Ziwei Xiao <ziweixiao@google.com>
  Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
  Reviewed-by: Shailend Chand <shailend@google.com>
  Link: https://lore.kernel.org/r/20240417205757.778551-1-ziweixiao@google.com
  Signed-off-by: Jakub Kicinski <kuba@kernel.org>
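  The queue-to-QPL mapping above can be checked with a short model. This is a sketch under stated assumptions, not the driver's code: the function names are hypothetical, the queue counts in the example are made up, and XDP queues are assumed to use the tx mapping (qpl_id equal to their tx queue id), which is how the commit message reads.

```python
def tx_qpl_id(tx_qid):
    """Tx queues: qpl_id = tx_qid."""
    return tx_qid


def rx_qpl_id(max_tx_queues, rx_qid):
    """Rx queues: qpl_id = max_tx_queues + rx_qid."""
    return max_tx_queues + rx_qid


def xdp_tx_qids(num_tx_queues, max_tx_queues):
    """Tx queue ids used for XDP: the same count, right after the regular tx queues."""
    if 2 * num_tx_queues > max_tx_queues:
        raise ValueError("with XDP, tx queues must be at most half of max_tx_queues")
    return list(range(num_tx_queues, 2 * num_tx_queues))


def all_qpl_ids(num_tx_queues, num_rx_queues, max_tx_queues, xdp=False):
    """Collect every qpl_id in use for one (hypothetical) queue configuration."""
    tx_qids = list(range(num_tx_queues))
    if xdp:
        tx_qids += xdp_tx_qids(num_tx_queues, max_tx_queues)
    ids = [tx_qpl_id(q) for q in tx_qids]
    ids += [rx_qpl_id(max_tx_queues, q) for q in range(num_rx_queues)]
    return ids


if __name__ == "__main__":
    # Made-up configuration: 8 tx + 8 XDP tx queues, 16 rx queues, max_tx_queues = 16.
    ids = all_qpl_ids(num_tx_queues=8, num_rx_queues=16, max_tx_queues=16, xdp=True)
    # No duplicates means no two queues share a qpl_id.
    print(len(ids) == len(set(ids)))
```

  Since every tx and XDP id stays below max_tx_queues and every rx id starts at max_tx_queues, the ranges cannot collide, which is the property the removed qpl_cfg struct used to enforce.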
* Merge branch 'net: Add support for Power over Ethernet (PoE)' | Jakub Kicinski | 2024-04-18 | 20 | -97/+3533

  Kory Maincent says:

  ====================
  net: Add support for Power over Ethernet (PoE)

  This patch series aims at adding support for PoE (Power over Ethernet), based on the already existing support for PoDL (Power over Data Line) implementation. In addition, it adds support for two specific PoE controllers, the Microchip PD692x0 and the TI TPS23881.
  ====================

  Link: https://lore.kernel.org/all/20240417-feature_poe-v9-0-242293fd1900@bootlin.com
  Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * net: pse-pd: Add TI TPS23881 PSE controller driver | Kory Maincent (Dent Project) | 2024-04-18 | 3 | -0/+830

    Add a new driver for the TI TPS23881 I2C Power Sourcing Equipment controller.

    Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
    Link: https://lore.kernel.org/r/20240417-feature_poe-v9-14-242293fd1900@bootlin.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * dt-bindings: net: pse-pd: Add bindings for TPS23881 PSE controller | Kory Maincent (Dent Project) | 2024-04-18 | 1 | -0/+95

    Add the TPS23881 I2C Power Sourcing Equipment controller device tree bindings documentation.

    Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
    Reviewed-by: Andrew Lunn <andrew@lunn.ch>
    Link: https://lore.kernel.org/r/20240417-feature_poe-v9-13-242293fd1900@bootlin.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>