summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* mptcp: introduce and use mptcp_pm_send_ack()Paolo Abeni2022-07-123-24/+35
| | | | | | | | | | | | | | The in-kernel PM has a bit of duplicate code related to ack generation. Create a new helper factoring out the PM-specific needs and use it in a couple of places. As a bonus, mptcp_subflow_send_ack() is not used anymore outside its own compilation unit and can become static. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* net: ip_tunnel: use strscpy to replace strlcpyXueBing Chen2022-07-121-2/+2
| | | | | | | | | The strlcpy should not be used because it doesn't limit the source length. Preferred is strscpy. Signed-off-by: XueBing Chen <chenxuebing@jari.cn> Link: https://lore.kernel.org/r/2a08f6c1.e30.181ed8b49ad.Coremail.chenxuebing@jari.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* tcp: make retransmitted SKB fit into the send windowYonglong Li2022-07-121-7/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | current code of __tcp_retransmit_skb only check TCP_SKB_CB(skb)->seq in send window, and TCP_SKB_CB(skb)->seq_end maybe out of send window. If receiver has shrunk his window, and skb is out of new window, it should retransmit a smaller portion of the payload. test packetdrill script: 0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +0 fcntl(3, F_GETFL) = 0x2 (flags O_RDWR) +0 fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0 +0 connect(3, ..., ...) = -1 EINPROGRESS (Operation now in progress) +0 > S 0:0(0) win 65535 <mss 1460,sackOK,TS val 100 ecr 0,nop,wscale 8> +.05 < S. 0:0(0) ack 1 win 6000 <mss 1000,nop,nop,sackOK> +0 > . 1:1(0) ack 1 +0 write(3, ..., 10000) = 10000 +0 > . 1:2001(2000) ack 1 win 65535 +0 > . 2001:4001(2000) ack 1 win 65535 +0 > . 4001:6001(2000) ack 1 win 65535 +.05 < . 1:1(0) ack 4001 win 1001 and tcpdump show: 192.168.226.67.55 > 192.0.2.1.8080: Flags [.], seq 1:2001, ack 1, win 65535, length 2000 192.168.226.67.55 > 192.0.2.1.8080: Flags [.], seq 2001:4001, ack 1, win 65535, length 2000 192.168.226.67.55 > 192.0.2.1.8080: Flags [P.], seq 4001:5001, ack 1, win 65535, length 1000 192.168.226.67.55 > 192.0.2.1.8080: Flags [.], seq 5001:6001, ack 1, win 65535, length 1000 192.0.2.1.8080 > 192.168.226.67.55: Flags [.], ack 4001, win 1001, length 0 192.168.226.67.55 > 192.0.2.1.8080: Flags [.], seq 5001:6001, ack 1, win 65535, length 1000 192.168.226.67.55 > 192.0.2.1.8080: Flags [P.], seq 4001:5001, ack 1, win 65535, length 1000 when cient retract window to 1001, send window is [4001,5002], but TLP send 5001-6001 packet which is out of send window. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Yonglong Li <liyonglong@chinatelecom.cn> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/1657532838-20200-1-git-send-email-liyonglong@chinatelecom.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* nfp: support TX VLAN ctag insert in NFDKDiana Wang2022-07-123-31/+35
| | | | | | | | | | | | | | | | | | | | | | | | Add support for TX VLAN ctag insert which may be configured via ethtool. e.g. # ethtool -K $DEV tx-vlan-offload on The NIC supplies VLAN insert information as packet metadata. The fields of this VLAN metadata including vlan_proto and vlan tag. Configuration control bit NFP_NET_CFG_CTRL_TXVLAN_V2 is to signal availability of ctag-insert features of the firmware. NFDK is used to communicate via PCIE to NFP-3800 based NICs while NFD3 is used for other NICs supported by the NFP driver. This features is currently implemented only for NFD3 and this patch adds support for it with NFDK. Signed-off-by: Diana Wang <na.wang@corigine.com> Reviewed-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20220711093048.1911698-1-simon.horman@corigine.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* nfp: fix clang -Wformat warningsJustin Stitt2022-07-122-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | When building with Clang we encounter these warnings: | drivers/net/ethernet/netronome/nfp/nfp_app.c:233:99: error: format | specifies type 'unsigned char' but the argument has underlying type | 'unsigned int' [-Werror,-Wformat] nfp_err(pf->cpp, "unknown FW app ID | 0x%02hhx, driver too old or support for FW not built in\n", id); - | drivers/net/ethernet/netronome/nfp/nfp_main.c:396:11: error: format | specifies type 'unsigned char' but the argument has type 'int' | [-Werror,-Wformat] serial, interface >> 8, interface & 0xff); Correct format specifier for `id` is `%x` since the default type for the `nfp_app_id` enum is `unsigned int`. The second warning is also solved by using the `%x` format specifier as the expressions involving `interface` are implicity promoted to integers (%x is used to maintain hexadecimal representation). Link: https://github.com/ClangBuiltLinux/linux/issues/378 Signed-off-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20220712000152.2292031-1-justinstitt@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* Merge branch 'dt-bindings-net-convert-sff-sfp-to-dtschema'Jakub Kicinski2022-07-1214-147/+205
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ioana Ciornei says: ==================== dt-bindings: net: convert sff,sfp to dtschema This patch set converts the sff,sfp to dtschema. The first patch does a somewhat mechanical conversion without changing anything else beside the format in which the dt binding is presented. In the second patch we rename some dt nodes to be generic. The last two patches change the GPIO related properties so that they uses the -gpios preferred suffix. This way, all the DTBs are passing the validation against the sff,sfp.yaml binding. ==================== Link: https://lore.kernel.org/r/20220707091437.446458-1-ioana.ciornei@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * arch: arm64: dts: marvell: rename the sfp GPIO propertiesIoana Ciornei2022-07-1210-58/+58
| | | | | | | | | | | | | | | | | | Rename the GPIO related sfp properties to include the preffered -gpios suffix. Also, with this change the dtb_check will no longer complain when trying to verify the DTS against the sff,sfp.yaml binding. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * arch: arm64: dts: lx2160a-clearfog-itx: rename the sfp GPIO propertiesIoana Ciornei2022-07-121-4/+4
| | | | | | | | | | | | | | | | | | | | Rename the 'mod-def0-gpio' property to 'mod-def0-gpios' so that we use the preferred -gpios suffix. Also, with this change the dtb_check will not complain when trying to verify the DTS against the sff,sfp.yaml binding. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * dt-bindings: net: sff,sfp: rename example dt nodes to be more genericIoana Ciornei2022-07-121-9/+9
| | | | | | | | | | | | | | | | | | Rename the dt nodes shown in the sff,sfp.yaml examples so that they are generic and not really tied to a specific platform. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * dt-bindings: net: convert sff,sfp to dtschemaIoana Ciornei2022-07-123-85/+143
|/ | | | | | | | | Convert the sff,sfp.txt bindings to the DT schema format. Also add the new path to the list of maintained files. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* net: change the type of ip_route_input_rcu to staticZhengchao Shao2022-07-122-21/+17
| | | | | | | | The type of ip_route_input_rcu should be static. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Link: https://lore.kernel.org/r/20220711073549.8947-1-shaozhengchao@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
* Merge branch 'mlx5-devlink-mutex-removal-part-1'Paolo Abeni2022-07-128-119/+94
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Moshe Shemesh Says: =================== 1) Fix devlink lock in mlx5 devlink eswitch callbacks Following the commit 14e426bf1a4d "devlink: hold the instance lock during eswitch_mode callbacks" which takes devlink instance lock for all devlink eswitch callbacks and adds a temporary workaround, this patchset removes the workaround, replaces devlink API functions by devl_ API where called from mlx5 driver eswitch callbacks flows and adds devlink instance lock in other driver's path that leads to these functions. While moving to devl_ API the patchset removes part of the devlink API functions which mlx5 was the last one to use and so not used by any driver now. The patchset also remove DEVLINK_NL_FLAG_NO_LOCK flag from the callbacks of port_new/port which are called only from mlx5 driver and the already locked by the patchset as parallel paths to the eswitch callbacks using devl_ API functions. This patchset will be followed by another patchset that will remove DEVLINK_NL_FLAG_NO_LOCK flag from devlink reload and devlink health callbacks. Thus we will have all devlink callbacks locked and it will pave the way to remove devlink mutex. =================== Link: https://lore.kernel.org/r/20220711081408.69452-1-saeed@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * devlink: Hold the instance lock in port_new / port_del callbacksMoshe Shemesh2022-07-122-9/+1
| | | | | | | | | | | | | | | | | | | | | | Let the core take the devlink instance lock around port_new and port_del callbacks and remove the now redundant locking in the only driver that currently use them. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * net/mlx5: Remove devl_unlock from mlx5_devlink_eswitch_mode_setMoshe Shemesh2022-07-121-20/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | The callback mlx5_devlink_eswitch_mode_set() had unlocked devlink as a temporary workaround once devlink instance lock was added to devlink eswitch callbacks. Now that all flows triggered by this function that took devlink lock are using devl_ API and all parallel paths are locked we can remove this workaround. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * net/mlx5: Use devl_ API in mlx5e_devlink_port_registerMoshe Shemesh2022-07-124-4/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As part of the flows invoked by mlx5_devlink_eswitch_mode_set() get to mlx5_rescan_drivers_locked() which can call mlx5e_probe()/mlx5e_remove and register/unregister mlx5e driver ports accordingly. This can lead to deadlock once mlx5_devlink_eswitch_mode_set() will use devlink lock. Use devl_port_register/unregister() instead of devlink_port_register/unregister() and add devlink instance locks in the driver paths to this function to have it locked while calling devl_ API function. If remove or probe were called by module init or module cleanup flows, need to lock devlink just before calling devl_port_register(), otherwise it is called by attach/detach or register/unregister flow and we can have the flow locked. Added flag to distinguish between these cases. This will be used by the downstream patch to invoke mlx5_devlink_eswitch_mode_set() with devlink locked. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * devlink: Remove unused functions devlink_rate_leaf_create/destroyMoshe Shemesh2022-07-122-37/+7
| | | | | | | | | | | | | | | | | | | | | | The previous patch removed the last usage of the functions devlink_rate_leaf_create() and devlink_rate_nodes_destroy(). Thus, remove these function from devlink API. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * net/mlx5: Use devl_ API in mlx5_esw_devlink_sf_port_registerMoshe Shemesh2022-07-122-5/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function mlx5_esw_devlink_sf_port_register() calls devlink_port_register() and devlink_rate_leaf_create(). Use devl_ API to call devl_port_register() and devl_rate_leaf_create() accordingly and add devlink instance lock in driver paths to this function. Similarly, use devl_ API to call devl_port_unregister() and devl_rate_leaf_destroy() in mlx5_esw_devlink_sf_port_unregister() and ensure locking devlink instance lock on all the paths to this function too. This will be used by the downstream patch to invoke mlx5_devlink_eswitch_mode_set() with devlink lock held. Note this patch is taking devlink lock on mlx5_devlink_sf_port_new/del() which are devlink callbacks for port_new/del(). We will take these locks off once these callbacks will be locked by devlink too. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * net/mlx5: Use devl_ API in mlx5_esw_offloads_devlink_port_registerMoshe Shemesh2022-07-123-5/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function mlx5_esw_offloads_devlink_port_register() calls devlink_port_register() and devlink_rate_leaf_create(). Use devl_ API to call devl_port_register() and devl_rate_leaf_create() accordingly and add devlink instance lock in driver paths to this function. Similarly, use devl_ API to call devl_port_unregister() and devl_rate_leaf_destroy() in mlx5_esw_offloads_devlink_port_unregister() and ensure locking devlink instance lock on the paths to this function too. This will be used by the downstream patch to invoke mlx5_devlink_eswitch_mode_set() with devlink lock held. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * devlink: Remove unused function devlink_rate_nodes_destroyMoshe Shemesh2022-07-122-19/+0
| | | | | | | | | | | | | | | | | | | | | | The previous patch removed the last usage of the function devlink_rate_nodes_destroy(). Thus, remove this function from devlink API. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * net/mlx5: Use devl_ API for rate nodes destroyMoshe Shemesh2022-07-122-2/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Use devl_rate_nodes_destroy() instead of devlink_rate_nodes_destroy(). Add devlink instance lock in the driver paths to this function to have it locked while calling devl_ API function. This will be used by the downstream patch to invoke mlx5_devlink_eswitch_mode_set() with devlink lock held. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * net/mlx5: Remove devl_unlock from mlx5_eswtich_mode_callback_enterMoshe Shemesh2022-07-121-32/+11
|/ | | | | | | | | | | | | | | | | | | | The function mlx5_eswtich_mode_callback_enter() was added as a temporary workaround once devlink instance lock was added to devlink eswitch callbacks. However, code review and testing show that all the callbacks part to eswitch_mode_set don't take devlink instance lock in any flow and so unlocking devlink instance lock while entering these functions is not needed. Remove devl_lock from mlx5_eswtich_mode_callback_enter() and devl_unlock from mlx5_eswtich_mode_callback_exit(). Also remove the functions mlx5_eswtich_mode_callback_enter()/exit() as they are not needed any more. The callback eswitch_mode_set will be treated separately in the following patches. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
* amd-xgbe: fix clang -Wformat warningsJustin Stitt2022-07-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When building with Clang we encounter the following warning: | drivers/net/ethernet/amd/xgbe/xgbe-dcb.c:234:42: error: format specifies | type 'unsigned char' but the argument has type '__u16' (aka 'unsigned | short') [-Werror,-Wformat] pfc->pfc_cap, pfc->pfc_en, pfc->mbc, | pfc->delay); pfc->pfc_cap , pfc->pfc_cn, pfc->mbc are all of type `u8` while pfc->delay is of type `u16`. The correct format specifiers `%hh[u|x]` were used for the first three but not for pfc->delay, which is causing the warning above. Variadic functions (printf-like) undergo default argument promotion. Documentation/core-api/printk-formats.rst specifically recommends using the promoted-to-type's format flag. In this case `%d` (or `%x` to maintain hex representation) should be used since both u8's and u16's are fully representable by an int. Moreover, C11 6.3.1.1 states: (https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf) `If an int can represent all values of the original type ..., the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions.` Link: https://github.com/ClangBuiltLinux/linux/issues/378 Signed-off-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://lore.kernel.org/r/20220708232653.556488-1-justinstitt@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* atm: he: Use the bitmap API to allocate bitmapsChristophe JAILLET2022-07-111-6/+3
| | | | | | | | | | Use bitmap_zalloc()/bitmap_free() instead of hand-writing them. It is less verbose and it improves the semantic. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/7f795bd6d5b2a00f581175b7069b229c2e5a4192.1657379127.git.christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* net/fq_impl: Use the bitmap API to allocate bitmapsChristophe JAILLET2022-07-111-3/+2
| | | | | | | | | | Use bitmap_zalloc()/bitmap_free() instead of hand-writing them. It is less verbose and it improves the semantic. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/c7bf099af07eb497b02d195906ee8c11fea3b3bd.1657377335.git.christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* net: dsa: hellcreek: Use the bitmap API to allocate bitmapsChristophe JAILLET2022-07-111-5/+2
| | | | | | | | | | | Use devm_bitmap_zalloc() instead of hand-writing them. It is less verbose and it improves the semantic. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Kurt Kanzenbach <kurt@linutronix.de> Link: https://lore.kernel.org/r/8306e2ae69a5d8553691f5d10a86a4390daf594b.1657376651.git.christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* Merge branch 'tls-rx-follow-ups-to-nopad'Jakub Kicinski2022-07-116-8/+66
|\ | | | | | | | | | | | | | | | | | | | | | | | | Jakub Kicinski says: ==================== tls: rx: follow-ups to NoPad A few fixes for issues spotted by Maxim. ==================== Link: https://lore.kernel.org/r/20220709025255.323864-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * selftests: tls: add test for NoPad getsockoptJakub Kicinski2022-07-111-0/+51
| | | | | | | | | | | | | | Make sure setsockopt / getsockopt behave as expected. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * tls: rx: fix the NoPad getsockoptJakub Kicinski2022-07-111-5/+4
| | | | | | | | | | | | | | | | | | | | Maxim reports do_tls_getsockopt_no_pad() will always return an error. Indeed looks like refactoring gone wrong - remove err and use value. Reported-by: Maxim Mikityanskiy <maximmi@nvidia.com> Fixes: 88527790c079 ("tls: rx: add sockopt for enabling optimistic decrypt with TLS 1.3") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * tls: rx: add counter for NoPad violationsJakub Kicinski2022-07-114-0/+8
| | | | | | | | | | | | | | | | | | As discussed with Maxim add a counter for true NoPad violations. This should help deployments catch unexpected padded records vs just control records which always need re-encryption. https: //lore.kernel.org/all/b111828e6ac34baad9f4e783127eba8344ac252d.camel@nvidia.com/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * tls: fix spelling of MIBJakub Kicinski2022-07-113-3/+3
|/ | | | | | | MIN -> MIB Fixes: 88527790c079 ("tls: rx: add sockopt for enabling optimistic decrypt with TLS 1.3") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* bcm63xx: fix Tx cleanup when NAPI poll budget is zeroSieng-Piaw Liew2022-07-111-5/+5
| | | | | | | | | | | | NAPI poll() function may be passed a budget value of zero, i.e. during netpoll, which isn't NAPI context. Therefore, napi_consume_skb() must be given budget value instead of !force to truly discern netpoll-like scenarios. Fixes: c63c615e22eb ("bcm63xx_enet: switch to napi_build_skb() to reuse skbuff_heads") Signed-off-by: Sieng-Piaw Liew <liew.s.piaw@gmail.com> Link: https://lore.kernel.org/r/20220708080303.298-1-liew.s.piaw@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* Merge branch 'octeontx2-exact-match-table'Jakub Kicinski2022-07-1120-67/+2901
|\ | | | | | | | | | | | | | | | | | | | | | | | | Ratheesh Kannoth says: ==================== octeontx2: Exact Match Table. Exact match table and Field hash support for CN10KB silicon ==================== Link: https://lore.kernel.org/r/20220708044151.2972645-1-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * octeontx2-af: Enable Exact match flag in kex profileRatheesh Kannoth2022-07-111-2/+3
| | | | | | | | | | | | | | | | | | Enabled EXACT match flag in Kex default profile. Since there is no space in key, NPC_PARSE_NIBBLE_ERRCODE is removed Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * octeontx2-pf: Add support for exact match table.Ratheesh Kannoth2022-07-114-29/+67
| | | | | | | | | | | | | | | | | | NPC exact match table can support more entries than RPM dmac filters. This requires field size of DMAC filter count and index to be increased. Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * octeontx2-af: Invoke exact match functions if supportedRatheesh Kannoth2022-07-112-0/+41
| | | | | | | | | | | | | | | | If exact match table is supported, call functions to add/del/update entries in exact match table instead of RPM dmac filters Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * octeontx2-af: Wrapper functions for MAC addr add/del/update/resetRatheesh Kannoth2022-07-112-12/+370
| | | | | | | | | | | | | | | | | | These functions are wrappers for mac add/addr/del/update in exact match table. These will be invoked from mbox handler routines if exact matct table is supported and enabled. Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * octeontx2: Modify mbox request and response structuresRatheesh Kannoth2022-07-113-9/+24
| | | | | | | | | | | | | | | | | | | | Exact match table modification requires wider fields as it has more number of slots to fill in. Modifying an entry in exact match table may cause hash collision and may be required to delete entry from 4-way 2K table and add to fully associative 32 entry CAM table. Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * octeontx2-af: Debugsfs support for exact match.Ratheesh Kannoth2022-07-111-0/+179
| | | | | | | | | | | | | | | | | | | | There debugfs files created. 1. General information on exact match table 2. Exact match table entries. 3. NPC mcam drop on hit count stats. Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * octeontx2-af: Drop rules for NPC MCAMRatheesh Kannoth2022-07-116-6/+481
| | | | | | | | | | | | | | | | | | | | | | NPC exact match table installs drop on hit rules in NPC mcam for each channel. This rule has broadcast and multicast bits cleared. Exact match bit cleared and channel bits set. If exact match table hit bit is 0, corresponding NPC mcam drop rule will be hit for the packet and will be dropped. Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * octeontx2-af: FLR handler for exact match table.Ratheesh Kannoth2022-07-113-0/+31
| | | | | | | | | | | | | | | | FLR handler should remove/free all exact match table resources corresponding to each interface. Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * octeontx2-af: devlink configuration supportRatheesh Kannoth2022-07-113-2/+101
| | | | | | | | | | | | | | | | | | | | CN10KB silicon supports Exact match feature. This feature can be disabled through devlink configuration. Devlink command fails if DMAC filter rules are already present. Once disabled, legacy RPM based DMAC filters will be configured. Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * octeontx2-af: Exact match scan from kex profileRatheesh Kannoth2022-07-112-3/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | CN10KB silicon supports exact match table. Scanning KEX profile should check for exact match feature is enabled and then set profile masks properly. These kex profile masks are required to configure NPC MCAM drop rules. If there is a miss in exact match table, these drop rules will drop those packets. Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * octeontx2-af: Exact match supportRatheesh Kannoth2022-07-116-1/+1029
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CN10KB silicon has support for exact match table. This table can be used to match maimum 64 bit value of KPU parsed output. Hit/non hit in exact match table can be used as a KEX key to NPC mcam. This patch makes use of Exact match table to increase number of DMAC filters supported. NPC mcam is no more need for each of these DMAC entries as will be populated in Exact match table. This patch implements following 1. Initialization of exact match table only for CN10KB. 2. Add/del/update interface function for exact match table. Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * octeontx2-af: Use hashed field in MCAM keyRatheesh Kannoth2022-07-1112-17/+554
|/ | | | | | | | | | | | | | CN10KB variant of CN10K series of silicons supports a new feature where in a large protocol field (eg 128bit IPv6 DIP) can be condensed into a small hashed 32bit data. This saves a lot of space in MCAM key and allows user to add more protocol fields into the filter. A max of two such protocol data can be hashed. This patch adds support for hashing IPv6 SIP and/or DIP. Signed-off-by: Suman Ghosh <sumang@marvell.com> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* fddi/skfp: fix repeated words in commentsJilin Yuan2022-07-111-1/+1
| | | | | | | Delete the redundant word 'test'. Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethernet/via: fix repeated words in commentsJilin Yuan2022-07-111-1/+1
| | | | | | | Delete the redundant word 'driver'. Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Find dst with sk's xfrm policy not ctl_sksewookseo2022-07-114-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we set XFRM security policy by calling setsockopt with option IPV6_XFRM_POLICY, the policy will be stored in 'sock_policy' in 'sock' struct. However tcp_v6_send_response doesn't look up dst_entry with the actual socket but looks up with tcp control socket. This may cause a problem that a RST packet is sent without ESP encryption & peer's TCP socket can't receive it. This patch will make the function look up dest_entry with actual socket, if the socket has XFRM policy(sock_policy), so that the TCP response packet via this function can be encrypted, & aligned on the encrypted TCP socket. Tested: We encountered this problem when a TCP socket which is encrypted in ESP transport mode encryption, receives challenge ACK at SYN_SENT state. After receiving challenge ACK, TCP needs to send RST to establish the socket at next SYN try. But the RST was not encrypted & peer TCP socket still remains on ESTABLISHED state. So we verified this with test step as below. [Test step] 1. Making a TCP state mismatch between client(IDLE) & server(ESTABLISHED). 2. Client tries a new connection on the same TCP ports(src & dst). 3. Server will return challenge ACK instead of SYN,ACK. 4. Client will send RST to server to clear the SOCKET. 5. Client will retransmit SYN to server on the same TCP ports. [Expected result] The TCP connection should be established. Cc: Maciej Żenczykowski <maze@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Sehee Lee <seheele@google.com> Signed-off-by: Sewook Seo <sewookseo@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextJakub Kicinski2022-07-09125-6701/+5141
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Daniel Borkmann says: ==================== pull-request: bpf-next 2022-07-09 We've added 94 non-merge commits during the last 19 day(s) which contain a total of 125 files changed, 5141 insertions(+), 6701 deletions(-). The main changes are: 1) Add new way for performing BTF type queries to BPF, from Daniel Müller. 2) Add inlining of calls to bpf_loop() helper when its function callback is statically known, from Eduard Zingerman. 3) Implement BPF TCP CC framework usability improvements, from Jörn-Thorben Hinz. 4) Add LSM flavor for attaching per-cgroup BPF programs to existing LSM hooks, from Stanislav Fomichev. 5) Remove all deprecated libbpf APIs in prep for 1.0 release, from Andrii Nakryiko. 6) Add benchmarks around local_storage to BPF selftests, from Dave Marchevsky. 7) AF_XDP sample removal (given move to libxdp) and various improvements around AF_XDP selftests, from Magnus Karlsson & Maciej Fijalkowski. 8) Add bpftool improvements for memcg probing and bash completion, from Quentin Monnet. 9) Add arm64 JIT support for BPF-2-BPF coupled with tail calls, from Jakub Sitnicki. 10) Sockmap optimizations around throughput of UDP transmissions which have been improved by 61%, from Cong Wang. 11) Rework perf's BPF prologue code to remove deprecated functions, from Jiri Olsa. 12) Fix sockmap teardown path to avoid sleepable sk_psock_stop, from John Fastabend. 13) Fix libbpf's cleanup around legacy kprobe/uprobe on error case, from Chuang Wang. 14) Fix libbpf's bpf_helpers.h to work with gcc for the case of its sec/pragma macro, from James Hilliard. 15) Fix libbpf's pt_regs macros for riscv to use a0 for RC register, from Yixun Lan. 16) Fix bpftool to show the name of type BPF_OBJ_LINK, from Yafang Shao. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (94 commits) selftests/bpf: Fix xdp_synproxy build failure if CONFIG_NF_CONNTRACK=m/n bpf: Correctly propagate errors up from bpf_core_composites_match libbpf: Disable SEC pragma macro on GCC bpf: Check attach_func_proto more carefully in check_return_code selftests/bpf: Add test involving restrict type qualifier bpftool: Add support for KIND_RESTRICT to gen min_core_btf command MAINTAINERS: Add entry for AF_XDP selftests files selftests, xsk: Rename AF_XDP testing app bpf, docs: Remove deprecated xsk libbpf APIs description selftests/bpf: Add benchmark for local_storage RCU Tasks Trace usage libbpf, riscv: Use a0 for RC register libbpf: Remove unnecessary usdt_rel_ip assignments selftests/bpf: Fix few more compiler warnings selftests/bpf: Fix bogus uninitialized variable warning bpftool: Remove zlib feature test from Makefile libbpf: Cleanup the legacy uprobe_event on failed add/attach_event() libbpf: Fix wrong variable used in perf_event_uprobe_open_legacy() libbpf: Cleanup the legacy kprobe_event on failed add/attach_event() selftests/bpf: Add type match test against kernel's task_struct selftests/bpf: Add nested type to type based tests ... ==================== Link: https://lore.kernel.org/r/20220708233145.32365-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * selftests/bpf: Fix xdp_synproxy build failure if CONFIG_NF_CONNTRACK=m/nMaxim Mikityanskiy2022-07-081-7/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When CONFIG_NF_CONNTRACK=m, struct bpf_ct_opts and enum member BPF_F_CURRENT_NETNS are not exposed. This commit allows building the xdp_synproxy selftest in such cases. Note that nf_conntrack must be loaded before running the test if it's compiled as a module. This commit also allows this selftest to be successfully compiled when CONFIG_NF_CONNTRACK is disabled. One unused local variable of type struct bpf_ct_opts is also removed. Fixes: fb5cd0ce70d4 ("selftests/bpf: Add selftests for raw syncookie helpers") Reported-by: Yauheni Kaliuta <ykaliuta@redhat.com> Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220708130319.1016294-1-maximmi@nvidia.com
| * bpf: Correctly propagate errors up from bpf_core_composites_matchDaniel Müller2022-07-081-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This change addresses a comment made earlier [0] about a missing return of an error when __bpf_core_types_match is invoked from bpf_core_composites_match, which could have let to us erroneously ignoring errors. Regarding the typedef name check pointed out in the same context, it is not actually an issue, because callers of the function perform a name check for the root type anyway. To make that more obvious, let's add comments to the function (similar to what we have for bpf_core_types_are_compat, which is called in pretty much the same context). [0]: https://lore.kernel.org/bpf/165708121449.4919.13204634393477172905.git-patchwork-notify@kernel.org/T/#m55141e8f8cfd2e8d97e65328fa04852870d01af6 Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220707211931.3415440-1-deso@posteo.net