linux-stable.git - Linux kernel stable tree

	Commit message (Collapse)	Author	Age	Files	Lines
*	net: enetc: survive memory pressure without crashing	Vladimir Oltean	2022-10-27	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Under memory pressure, enetc_refill_rx_ring() may fail, and when called during the enetc_open() -> enetc_setup_rxbdr() procedure, this is not checked for. An extreme case of memory pressure will result in exactly zero buffers being allocated for the RX ring, and in such a case it is expected that hardware drops all RX packets due to lack of buffers. This does not happen, because the reset-default value of the consumer and produces index is 0, and this makes the ENETC think that all buffers have been initialized and that it owns them (when in reality none were). The hardware guide explains this best: \| Configure the receive ring producer index register RBaPIR with a value \| of 0. The producer index is initially configured by software but owned \| by hardware after the ring has been enabled. Hardware increments the \| index when a frame is received which may consume one or more BDs. \| Hardware is not allowed to increment the producer index to match the \| consumer index since it is used to indicate an empty condition. The ring \| can hold at most RBLENR[LENGTH]-1 received BDs. \| \| Configure the receive ring consumer index register RBaCIR. The \| consumer index is owned by software and updated during operation of the \| of the BD ring by software, to indicate that any receive data occupied \| in the BD has been processed and it has been prepared for new data. \| - If consumer index and producer index are initialized to the same \| value, it indicates that all BDs in the ring have been prepared and \| hardware owns all of the entries. \| - If consumer index is initialized to producer index plus N, it would \| indicate N BDs have been prepared. Note that hardware cannot start if \| only a single buffer is prepared due to the restrictions described in \| (2). \| - Software may write consumer index to match producer index anytime \| while the ring is operational to indicate all received BDs prior have \| been processed and new BDs prepared for hardware. Normally, the value of rx_ring->rcir (consumer index) is brought in sync with the rx_ring->next_to_use software index, but this only happens if page allocation ever succeeded. When PI==CI==0, the hardware appears to receive frames and write them to DMA address 0x0 (?!), then set the READY bit in the BD. The enetc_clean_rx_ring() function (and its XDP derivative) is naturally not prepared to handle such a condition. It will attempt to process those frames using the rx_swbd structure associated with index i of the RX ring, but that structure is not fully initialized (enetc_new_page() does all of that). So what happens next is undefined behavior. To operate using no buffer, we must initialize the CI to PI + 1, which will block the hardware from advancing the CI any further, and drop everything. The issue was seen while adding support for zero-copy AF_XDP sockets, where buffer memory comes from user space, which can even decide to supply no buffers at all (example: "xdpsock --txonly"). However, the bug is present also with the network stack code, even though it would take a very determined person to trigger a page allocation failure at the perfect time (a series of ifup/ifdown under memory pressure should eventually reproduce it given enough retries). Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Link: https://lore.kernel.org/r/20221027182925.3256653-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	net/mlx5e: Fix macsec sci endianness at rx sa update	Raed Salem	2022-10-27	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The cited commit at rx sa update operation passes the sci object attribute, in the wrong endianness and not as expected by the HW effectively create malformed hw sa context in case of update rx sa consequently, HW produces unexpected MACsec packets which uses this sa. Fix by passing sci to create macsec object with the correct endianness, while at it add __force u64 to prevent sparse check error of type "sparse: error: incorrect type in assignment". Fixes: aae3454e4d4c ("net/mlx5e: Add MACsec offload Rx command support") Signed-off-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-16-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	net/mlx5e: Fix wrong bitwise comparison usage in macsec_fs_rx_add_rule function	Raed Salem	2022-10-27	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The cited commit produces a sparse check error of type "sparse: error: restricted __be64 degrades to integer". The offending line wrongly did a bitwise operation between two different storage types one of 64 bit when the other smaller side is 16 bit which caused the above sparse error, furthermore bitwise operation usage here is wrong in the first place as the constant MACSEC_PORT_ES is not a bitwise field. Fix by using the right mask to get the lower 16 bit if the sci number, and use comparison operator '==' instead of bitwise '&' operator. Fixes: 3b20949cb21b ("net/mlx5e: Add MACsec RX steering rules") Signed-off-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-15-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	net/mlx5e: Fix macsec rx security association (SA) update/delete	Raed Salem	2022-10-27	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The cited commit adds the support for update/delete MACsec Rx SA, naturally, these operations need to check if the SA in question exists to update/delete the SA and return error code otherwise, however they do just the opposite i.e. return with error if the SA exists Fix by change the check to return error in case the SA in question does not exist, adjust error message and code accordingly. Fixes: aae3454e4d4c ("net/mlx5e: Add MACsec offload Rx command support") Signed-off-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-14-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	net/mlx5e: Fix macsec coverity issue at rx sa update	Raed Salem	2022-10-27	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The cited commit at update rx sa operation passes object attributes to MACsec object create function without initializing/setting all attributes fields leaving some of them with garbage values, therefore violating the implicit assumption at create object function, which assumes that all input object attributes fields are set. Fix by initializing the object attributes struct to zero, thus leaving unset fields with the legal zero value. Fixes: aae3454e4d4c ("net/mlx5e: Add MACsec offload Rx command support") Signed-off-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Lior Nahmanson <liorna@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-13-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	net/mlx5: Fix crash during sync firmware reset	Suresh Devarakonda	2022-10-27	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When setting Bluefield to DPU NIC mode using mlxconfig tool + sync firmware reset flow, we run into scenario where the host was not eswitch manager at the time of mlx5 driver load but becomes eswitch manager after the sync firmware reset flow. This results in null pointer access of mpfs structure during mac filter add. This change prevents null pointer access but mpfs table entries will not be added. Fixes: 5ec697446f46 ("net/mlx5: Add support for devlink reload action fw activate") Signed-off-by: Suresh Devarakonda <ramad@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Bodong Wang <bodong@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-12-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	net/mlx5: Update fw fatal reporter state on PCI handlers successful recover	Roy Novich	2022-10-27	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Update devlink health fw fatal reporter state to "healthy" is needed by strictly calling devlink_health_reporter_state_update() after recovery was done by PCI error handler. This is needed when fw_fatal reporter was triggered due to PCI error. Poll health is called and set reporter state to error. Health recovery failed (since EEH didn't re-enable the PCI). PCI handlers keep on recover flow and succeed later without devlink acknowledgment. Fix this by adding devlink state update at the end of the PCI handler recovery process. Fixes: 6181e5cb752e ("devlink: add support for reporter recovery completion") Signed-off-by: Roy Novich <royno@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-11-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	net/mlx5e: TC, Fix cloned flow attr instance dests are not zeroed	Roi Dayan	2022-10-27	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	On multi table split the driver creates a new attr instance with data being copied from prev attr instance zeroing action flags. Also need to reset dests properties to avoid incorrect dests per attr. Fixes: 8300f225268b ("net/mlx5e: Create new flow attr for multi table actions") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-10-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	net/mlx5e: TC, Reject forwarding from internal port to internal port	Ariel Levkovich	2022-10-27	1	-1/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reject TC rules that forward from internal port to internal port as it is not supported. This include rules that are explicitly have internal port as the filter device as well as rules that apply on tunnel interfaces as the route device for the tunnel interface can be an internal port. Fixes: 27484f7170ed ("net/mlx5e: Offload tc rules that redirect to ovs internal port") Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-9-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	net/mlx5: Fix possible use-after-free in async command interface	Tariq Toukan	2022-10-27	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	mlx5_cmd_cleanup_async_ctx should return only after all its callback handlers were completed. Before this patch, the below race between mlx5_cmd_cleanup_async_ctx and mlx5_cmd_exec_cb_handler was possible and lead to a use-after-free: 1. mlx5_cmd_cleanup_async_ctx is called while num_inflight is 2 (i.e. elevated by 1, a single inflight callback). 2. mlx5_cmd_cleanup_async_ctx decreases num_inflight to 1. 3. mlx5_cmd_exec_cb_handler is called, decreases num_inflight to 0 and is about to call wake_up(). 4. mlx5_cmd_cleanup_async_ctx calls wait_event, which returns immediately as the condition (num_inflight == 0) holds. 5. mlx5_cmd_cleanup_async_ctx returns. 6. The caller of mlx5_cmd_cleanup_async_ctx frees the mlx5_async_ctx object. 7. mlx5_cmd_exec_cb_handler goes on and calls wake_up() on the freed object. Fix it by syncing using a completion object. Mark it completed when num_inflight reaches 0. Trace: BUG: KASAN: use-after-free in do_raw_spin_lock+0x23d/0x270 Read of size 4 at addr ffff888139cd12f4 by task swapper/5/0 CPU: 5 PID: 0 Comm: swapper/5 Not tainted 6.0.0-rc3_for_upstream_debug_2022_08_30_13_10 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <IRQ> dump_stack_lvl+0x57/0x7d print_report.cold+0x2d5/0x684 ? do_raw_spin_lock+0x23d/0x270 kasan_report+0xb1/0x1a0 ? do_raw_spin_lock+0x23d/0x270 do_raw_spin_lock+0x23d/0x270 ? rwlock_bug.part.0+0x90/0x90 ? __delete_object+0xb8/0x100 ? lock_downgrade+0x6e0/0x6e0 _raw_spin_lock_irqsave+0x43/0x60 ? __wake_up_common_lock+0xb9/0x140 __wake_up_common_lock+0xb9/0x140 ? __wake_up_common+0x650/0x650 ? destroy_tis_callback+0x53/0x70 [mlx5_core] ? kasan_set_track+0x21/0x30 ? destroy_tis_callback+0x53/0x70 [mlx5_core] ? kfree+0x1ba/0x520 ? do_raw_spin_unlock+0x54/0x220 mlx5_cmd_exec_cb_handler+0x136/0x1a0 [mlx5_core] ? mlx5_cmd_cleanup_async_ctx+0x220/0x220 [mlx5_core] ? mlx5_cmd_cleanup_async_ctx+0x220/0x220 [mlx5_core] mlx5_cmd_comp_handler+0x65a/0x12b0 [mlx5_core] ? dump_command+0xcc0/0xcc0 [mlx5_core] ? lockdep_hardirqs_on_prepare+0x400/0x400 ? cmd_comp_notifier+0x7e/0xb0 [mlx5_core] cmd_comp_notifier+0x7e/0xb0 [mlx5_core] atomic_notifier_call_chain+0xd7/0x1d0 mlx5_eq_async_int+0x3ce/0xa20 [mlx5_core] atomic_notifier_call_chain+0xd7/0x1d0 ? irq_release+0x140/0x140 [mlx5_core] irq_int_handler+0x19/0x30 [mlx5_core] __handle_irq_event_percpu+0x1f2/0x620 handle_irq_event+0xb2/0x1d0 handle_edge_irq+0x21e/0xb00 __common_interrupt+0x79/0x1a0 common_interrupt+0x78/0xa0 </IRQ> <TASK> asm_common_interrupt+0x22/0x40 RIP: 0010:default_idle+0x42/0x60 Code: c1 83 e0 07 48 c1 e9 03 83 c0 03 0f b6 14 11 38 d0 7c 04 84 d2 75 14 8b 05 eb 47 22 02 85 c0 7e 07 0f 00 2d e0 9f 48 00 fb f4 <c3> 48 c7 c7 80 08 7f 85 e8 d1 d3 3e fe eb de 66 66 2e 0f 1f 84 00 RSP: 0018:ffff888100dbfdf0 EFLAGS: 00000242 RAX: 0000000000000001 RBX: ffffffff84ecbd48 RCX: 1ffffffff0afe110 RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffffffff835cc9bc RBP: 0000000000000005 R08: 0000000000000001 R09: ffff88881dec4ac3 R10: ffffed1103bd8958 R11: 0000017d0ca571c9 R12: 0000000000000005 R13: ffffffff84f024e0 R14: 0000000000000000 R15: dffffc0000000000 ? default_idle_call+0xcc/0x450 default_idle_call+0xec/0x450 do_idle+0x394/0x450 ? arch_cpu_idle_exit+0x40/0x40 ? do_idle+0x17/0x450 cpu_startup_entry+0x19/0x20 start_secondary+0x221/0x2b0 ? set_cpu_sibling_map+0x2070/0x2070 secondary_startup_64_no_verify+0xcd/0xdb </TASK> Allocated by task 49502: kasan_save_stack+0x1e/0x40 __kasan_kmalloc+0x81/0xa0 kvmalloc_node+0x48/0xe0 mlx5e_bulk_async_init+0x35/0x110 [mlx5_core] mlx5e_tls_priv_tx_list_cleanup+0x84/0x3e0 [mlx5_core] mlx5e_ktls_cleanup_tx+0x38f/0x760 [mlx5_core] mlx5e_cleanup_nic_tx+0xa7/0x100 [mlx5_core] mlx5e_detach_netdev+0x1ca/0x2b0 [mlx5_core] mlx5e_suspend+0xdb/0x140 [mlx5_core] mlx5e_remove+0x89/0x190 [mlx5_core] auxiliary_bus_remove+0x52/0x70 device_release_driver_internal+0x40f/0x650 driver_detach+0xc1/0x180 bus_remove_driver+0x125/0x2f0 auxiliary_driver_unregister+0x16/0x50 mlx5e_cleanup+0x26/0x30 [mlx5_core] cleanup+0xc/0x4e [mlx5_core] __x64_sys_delete_module+0x2b5/0x450 do_syscall_64+0x3d/0x90 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Freed by task 49502: kasan_save_stack+0x1e/0x40 kasan_set_track+0x21/0x30 kasan_set_free_info+0x20/0x30 ____kasan_slab_free+0x11d/0x1b0 kfree+0x1ba/0x520 mlx5e_tls_priv_tx_list_cleanup+0x2e7/0x3e0 [mlx5_core] mlx5e_ktls_cleanup_tx+0x38f/0x760 [mlx5_core] mlx5e_cleanup_nic_tx+0xa7/0x100 [mlx5_core] mlx5e_detach_netdev+0x1ca/0x2b0 [mlx5_core] mlx5e_suspend+0xdb/0x140 [mlx5_core] mlx5e_remove+0x89/0x190 [mlx5_core] auxiliary_bus_remove+0x52/0x70 device_release_driver_internal+0x40f/0x650 driver_detach+0xc1/0x180 bus_remove_driver+0x125/0x2f0 auxiliary_driver_unregister+0x16/0x50 mlx5e_cleanup+0x26/0x30 [mlx5_core] cleanup+0xc/0x4e [mlx5_core] __x64_sys_delete_module+0x2b5/0x450 do_syscall_64+0x3d/0x90 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Fixes: e355477ed9e4 ("net/mlx5: Make mlx5_cmd_exec_cb() a safe API") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-8-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	net/mlx5: ASO, Create the ASO SQ with the correct timestamp format	Saeed Mahameed	2022-10-27	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	mlx5 SQs must select the timestamp format explicitly according to the active clock mode, select the current active timestamp mode so ASO SQ create will succeed. This fixes the following error prints when trying to create ipsec ASO SQ while the timestamp format is real time mode. mlx5_cmd_out_err:778:(pid 34874): CREATE_SQ(0x904) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0xd61c0b), err(-22) mlx5_aso_create_sq:285:(pid 34874): Failed to open aso wq sq, err=-22 mlx5e_ipsec_init:436:(pid 34874): IPSec initialization failed, -22 Fixes: cdd04f4d4d71 ("net/mlx5: Add support to create SQ and CQ for ASO") Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reported-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-7-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	net/mlx5e: Update restore chain id for slow path packets	Paul Blakey	2022-10-27	2	-2/+62
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently encap slow path rules just forward to software without setting the chain id miss register, so driver doesn't restore the chain, and packets hitting this rule will restart from tc chain 0 instead of continuing to the chain the encap rule was on. Fix this by setting the chain id miss register to the chain id mapping. Fixes: 8f1e0b97cc70 ("net/mlx5: E-Switch, Mark miss packets with new chain id mapping") Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Oz Shlomo <ozsh@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-6-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	net/mlx5e: Extend SKB room check to include PTP-SQ	Aya Levin	2022-10-27	3	-0/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When tx_port_ts is set, the driver diverts all UPD traffic over PTP port to a dedicated PTP-SQ. The SKBs are cached until the wire-CQE arrives. When the packet size is greater then MTU, the firmware might drop it and the packet won't be transmitted to the wire, hence the wire-CQE won't reach the driver. In this case the SKBs are accumulated in the SKB fifo. Add room check to consider the PTP-SQ SKB fifo, when the SKB fifo is full, driver stops the queue resulting in a TX timeout. Devlink TX-reporter can recover from it. Fixes: 1880bc4e4a96 ("net/mlx5e: Add TX port timestamp support") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-5-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	net/mlx5: DR, Fix matcher disconnect error flow	Rongwei Liu	2022-10-27	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When 2nd flow rules arrives, it will merge together with the 1st one if matcher criteria is the same. If merge fails, driver will rollback the merge contents, and reject the 2nd rule. At rollback stage, matcher can't be disconnected unconditionally, otherise the 1st rule can't be hit anymore. Add logic to check if the matcher should be disconnected or not. Fixes: cc2295cd54e4 ("net/mlx5: DR, Improve steering for empty or RX/TX-only matchers") Signed-off-by: Rongwei Liu <rongweil@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-4-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	net/mlx5: Wait for firmware to enable CRS before pci_restore_state	Moshe Shemesh	2022-10-27	1	-0/+17
\| \| \| \| \| \| \| \| \| \| \| \|	After firmware reset driver should verify firmware already enabled CRS and became responsive to pci config cycles before restoring pci state. Fix that by waiting till device_id is readable through PCI again. Fixes: eabe8e5e88f5 ("net/mlx5: Handle sync reset now event") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-3-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	net/mlx5e: Do not increment ESN when updating IPsec ESN state	Hyong Youb Kim	2022-10-27	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	An offloaded SA stops receiving after about 2^32 + replay_window packets. For example, when SA reaches <seq-hi 0x1, seq 0x2c>, all subsequent packets get dropped with SA-icv-failure (integrity_failed). To reproduce the bug: - ConnectX-6 Dx with crypto enabled (FW 22.30.1004) - ipsec.conf: nic-offload = yes replay-window = 32 esn = yes salifetime=24h - Run netperf for a long time to send more than 2^32 packets netperf -H <device-under-test> -t TCP_STREAM -l 20000 When 2^32 + replay_window packets are received, the replay window moves from the 2nd half of subspace (overlap=1) to the 1st half (overlap=0). The driver then updates the 'esn' value in NIC (i.e. seq_hi) as follows. seq_hi = xfrm_replay_seqhi(seq_bottom) new esn in NIC = seq_hi + 1 The +1 increment is wrong, as seq_hi already contains the correct seq_hi. For example, when seq_hi=1, the driver actually tells NIC to use seq_hi=2 (esn). This incorrect esn value causes all subsequent packets to fail integrity checks (SA-icv-failure). So, do not increment. Fixes: cb01008390bb ("net/mlx5: IPSec, Add support for ESN") Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com> Acked-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-2-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	netdevsim: remove dir in nsim_dev_debugfs_init() when creating ports dir failed	Zhengchao Shao	2022-10-27	1	-4/+7
\| \| \| \| \| \| \| \| \| \| \|	Remove dir in nsim_dev_debugfs_init() when creating ports dir failed. Otherwise, the netdevsim device will not be created next time. Kernel reports an error: debugfs: Directory 'netdevsim1' with parent 'netdevsim' already present! Fixes: ab1d0cc004d7 ("netdevsim: change debugfs tree topology") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	netdevsim: fix memory leak in nsim_drv_probe() when ↵	Zhengchao Shao	2022-10-27	1	-7/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	nsim_dev_resources_register() failed If some items in nsim_dev_resources_register() fail, memory leak will occur. The following is the memory leak information. unreferenced object 0xffff888074c02600 (size 128): comm "echo", pid 8159, jiffies 4294945184 (age 493.530s) hex dump (first 32 bytes): 40 47 ea 89 ff ff ff ff 01 00 00 00 00 00 00 00 @G.............. ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................ backtrace: [<0000000011a31c98>] kmalloc_trace+0x22/0x60 [<0000000027384c69>] devl_resource_register+0x144/0x4e0 [<00000000a16db248>] nsim_drv_probe+0x37a/0x1260 [<000000007d1f448c>] really_probe+0x20b/0xb10 [<00000000c416848a>] __driver_probe_device+0x1b3/0x4a0 [<00000000077e0351>] driver_probe_device+0x49/0x140 [<0000000054f2465a>] __device_attach_driver+0x18c/0x2a0 [<000000008538f359>] bus_for_each_drv+0x151/0x1d0 [<0000000038e09747>] __device_attach+0x1c9/0x4e0 [<00000000dd86e533>] bus_probe_device+0x1d5/0x280 [<00000000839bea35>] device_add+0xae0/0x1cb0 [<000000009c2abf46>] new_device_store+0x3b6/0x5f0 [<00000000fb823d7f>] bus_attr_store+0x72/0xa0 [<000000007acc4295>] sysfs_kf_write+0x106/0x160 [<000000005f50cb4d>] kernfs_fop_write_iter+0x3a8/0x5a0 [<0000000075eb41bf>] vfs_write+0x8f0/0xc80 Fixes: 37923ed6b8ce ("netdevsim: Add simple FIB resource controller via devlink") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	netdevsim: fix memory leak in nsim_bus_dev_new()	Zhengchao Shao	2022-10-27	1	-2/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If device_register() failed in nsim_bus_dev_new(), the value of reference in nsim_bus_dev->dev is 1. obj->name in nsim_bus_dev->dev will not be released. unreferenced object 0xffff88810352c480 (size 16): comm "echo", pid 5691, jiffies 4294945921 (age 133.270s) hex dump (first 16 bytes): 6e 65 74 64 65 76 73 69 6d 31 00 00 00 00 00 00 netdevsim1...... backtrace: [<000000005e2e5e26>] __kmalloc_node_track_caller+0x3a/0xb0 [<0000000094ca4fc8>] kvasprintf+0xc3/0x160 [<00000000aad09bcc>] kvasprintf_const+0x55/0x180 [<000000009bac868d>] kobject_set_name_vargs+0x56/0x150 [<000000007c1a5d70>] dev_set_name+0xbb/0xf0 [<00000000ad0d126b>] device_add+0x1f8/0x1cb0 [<00000000c222ae24>] new_device_store+0x3b6/0x5e0 [<0000000043593421>] bus_attr_store+0x72/0xa0 [<00000000cbb1833a>] sysfs_kf_write+0x106/0x160 [<00000000d0dedb8a>] kernfs_fop_write_iter+0x3a8/0x5a0 [<00000000770b66e2>] vfs_write+0x8f0/0xc80 [<0000000078bb39be>] ksys_write+0x106/0x210 [<00000000005e55a4>] do_syscall_64+0x35/0x80 [<00000000eaa40bbc>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 Fixes: 40e4fe4ce115 ("netdevsim: move device registration and related code to bus.c") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Link: https://lore.kernel.org/r/20221026015405.128795-1-shaozhengchao@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	Merge tag 'linux-can-fixes-for-6.1-20221027' of ↵	Jakub Kicinski	2022-10-27	3	-17/+15
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2022-10-27 Anssi Hannula fixes the use of the completions in the kvaser_usb driver. Biju Das contributes 2 patches for the rcar_canfd driver. A IRQ storm that can be triggered by high CAN bus load and channel specific IRQ handlers are fixed. Yang Yingliang fixes the j1939 transport protocol by moving a kfree_skb() out of a spin_lock_irqsave protected section. * tag 'linux-can-fixes-for-6.1-20221027' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: can: j1939: transport: j1939_session_skb_drop_old(): spin_unlock_irqrestore() before kfree_skb() can: rcar_canfd: fix channel specific IRQ handling for RZ/G2L can: rcar_canfd: rcar_canfd_handle_global_receive(): fix IRQ storm on global FIFO receive can: kvaser_usb: Fix possible completions during init_completion ==================== Link: https://lore.kernel.org/r/20221027114356.1939821-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
\| *	can: rcar_canfd: fix channel specific IRQ handling for RZ/G2L	Biju Das	2022-10-27	1	-11/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	RZ/G2L has separate channel specific IRQs for transmit and error interrupts. But the IRQ handler processes both channels, even if there no interrupt occurred on one of the channels. This patch fixes the issue by passing a channel specific context parameter instead of global one for the IRQ register and the IRQ handler, it just handles the channel which is triggered the interrupt. Fixes: 76e9353a80e9 ("can: rcar_canfd: Add support for RZ/G2L family") Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Link: https://lore.kernel.org/all/20221025155657.1426948-3-biju.das.jz@bp.renesas.com Cc: stable@vger.kernel.org [mkl: adjust commit message] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
\| *	can: rcar_canfd: rcar_canfd_handle_global_receive(): fix IRQ storm on global ↵	Biju Das	2022-10-27	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	FIFO receive We are seeing an IRQ storm on the global receive IRQ line under heavy CAN bus load conditions with both CAN channels enabled. Conditions: The global receive IRQ line is shared between can0 and can1, either of the channels can trigger interrupt while the other channel's IRQ line is disabled (RFIE). When global a receive IRQ interrupt occurs, we mask the interrupt in the IRQ handler. Clearing and unmasking of the interrupt is happening in rx_poll(). There is a race condition where rx_poll() unmasks the interrupt, but the next IRQ handler does not mask the IRQ due to NAPIF_STATE_MISSED flag (e.g.: can0 RX FIFO interrupt is disabled and can1 is triggering RX interrupt, the delay in rx_poll() processing results in setting NAPIF_STATE_MISSED flag) leading to an IRQ storm. This patch fixes the issue by checking IRQ active and enabled before handling the IRQ on a particular channel. Fixes: dd3bd23eb438 ("can: rcar_canfd: Add Renesas R-Car CAN FD driver") Suggested-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Link: https://lore.kernel.org/all/20221025155657.1426948-2-biju.das.jz@bp.renesas.com Cc: stable@vger.kernel.org [mkl: adjust commit message] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
\| *	can: kvaser_usb: Fix possible completions during init_completion	Anssi Hannula	2022-10-27	2	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	kvaser_usb uses completions to signal when a response event is received for outgoing commands. However, it uses init_completion() to reinitialize the start_comp and stop_comp completions before sending the start/stop commands. In case the device sends the corresponding response just before the actual command is sent, complete() may be called concurrently with init_completion() which is not safe. This might be triggerable even with a properly functioning device by stopping the interface (CMD_STOP_CHIP) just after it goes bus-off (which also causes the driver to send CMD_STOP_CHIP when restart-ms is off), but that was not tested. Fix the issue by using reinit_completion() instead. Fixes: 080f40a6fa28 ("can: kvaser_usb: Add support for Kvaser CAN/USB devices") Tested-by: Jimmy Assarsson <extja@kvaser.com> Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Signed-off-by: Jimmy Assarsson <extja@kvaser.com> Link: https://lore.kernel.org/all/20221010185237.319219-2-extja@kvaser.com Cc: stable@vger.kernel.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
* \|	net: broadcom: bcm4908_enet: update TX stats after actual transmission	Rafał Miłecki	2022-10-27	1	-4/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Queueing packets doesn't guarantee their transmission. Update TX stats after hardware confirms consuming submitted data. This also fixes a possible race and NULL dereference. bcm4908_enet_start_xmit() could try to access skb after freeing it in the bcm4908_enet_poll_tx(). Reported-by: Florian Fainelli <f.fainelli@gmail.com> Fixes: 4feffeadbcb2e ("net: broadcom: bcm4908enet: add BCM4908 controller driver") Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20221027112430.8696-1-zajec5@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* \|	net: bcmsysport: Indicate MAC is in charge of PHY PM	Florian Fainelli	2022-10-27	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Avoid the PHY library call unnecessarily into the suspend/resume functions by setting phydev->mac_managed_pm to true. The SYSTEMPORT driver essentially does exactly what mdio_bus_phy_resume() does by calling phy_resume(). Fixes: fba863b81604 ("net: phy: make PHY PM ops a no-op if MAC driver manages PHY PM") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20221025234201.2549360-1-f.fainelli@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
* \|	net: ehea: fix possible memory leak in ehea_register_port()	Yang Yingliang	2022-10-27	1	-0/+1
\|/ \| \| \| \| \| \| \| \| \| \| \| \|	If of_device_register() returns error, the of node and the name allocated in dev_set_name() is leaked, call put_device() to give up the reference that was set in device_initialize(), so that of node is put in logical_port_release() and the name is freed in kobject_cleanup(). Fixes: 1acf2318dd13 ("ehea: dynamic add / remove port") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20221025130011.1071357-1-yangyingliang@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
*	net: ethernet: ave: Fix MAC to be in charge of PHY PM	Kunihiko Hayashi	2022-10-26	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The phylib callback is called after MAC driver's own resume callback is called. For AVE driver, after resuming immediately, PHY state machine is in PHY_NOLINK because there is a time lag from link-down to link-up due to autoneg. The result is WARN_ON() dump in mdio_bus_phy_resume(). Since ave_resume() itself calls phy_resume(), AVE driver should manage PHY PM. To indicate that MAC driver manages PHY PM, set phydev->mac_managed_pm to true to avoid the unnecessary phylib call and add missing phy_init_hw() to ave_resume(). Suggested-by: Heiner Kallweit <hkallweit1@gmail.com> Fixes: fba863b81604 ("net: phy: make PHY PM ops a no-op if MAC driver manages PHY PM") Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Link: https://lore.kernel.org/r/20221024072227.24769-1-hayashi.kunihiko@socionext.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	net: fec: limit register access on i.MX6UL	Juergen Borleis	2022-10-26	1	-2/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Using 'ethtool -d […]' on an i.MX6UL leads to a kernel crash: Unhandled fault: external abort on non-linefetch (0x1008) at […] due to this SoC has less registers in its FEC implementation compared to other i.MX6 variants. Thus, a run-time decision is required to avoid access to non-existing registers. Fixes: a51d3ab50702 ("net: fec: use a more proper compatible string for i.MX6UL type device") Signed-off-by: Juergen Borleis <jbe@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20221024080552.21004-1-jbe@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	Merge tag 'linux-can-fixes-for-6.1-20221025' of ↵	Jakub Kicinski	2022-10-26	2	-4/+9
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2022-10-25 The 1st patch adds a missing cleanup call in the error path of the probe function in mpc5xxx glue code for the mscan driver. The 2nd patch adds a missing cleanup call in the error path of the probe function of the mcp251x driver. * tag 'linux-can-fixes-for-6.1-20221025' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: can: mcp251x: mcp251x_can_probe(): add missing unregister_candev() in error path can: mscan: mpc5xxx: mpc5xxx_can_probe(): add missing put_clock() in error path ==================== Link: https://lore.kernel.org/r/20221026075520.1502520-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
\| *	can: mcp251x: mcp251x_can_probe(): add missing unregister_candev() in error path	Dongliang Mu	2022-10-25	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In mcp251x_can_probe(), if mcp251x_gpio_setup() fails, it forgets to unregister the CAN device. Fix this by unregistering can device in mcp251x_can_probe(). Fixes: 2d52dabbef60 ("can: mcp251x: add GPIO support") Signed-off-by: Dongliang Mu <dzm91@hust.edu.cn> Link: https://lore.kernel.org/all/20221024090256.717236-1-dzm91@hust.edu.cn [mkl: adjust label] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
\| *	can: mscan: mpc5xxx: mpc5xxx_can_probe(): add missing put_clock() in error path	Dongliang Mu	2022-10-25	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The commit 1149108e2fbf ("can: mscan: improve clock API use") only adds put_clock() in mpc5xxx_can_remove() function, forgetting to add put_clock() in the error handling code. Fix this bug by adding put_clock() in the error handling code. Fixes: 1149108e2fbf ("can: mscan: improve clock API use") Signed-off-by: Dongliang Mu <dzm91@hust.edu.cn> Link: https://lore.kernel.org/all/20221024133828.35881-1-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
* \|	net: ipa: don't configure IDLE_INDICATION on v3.1	Caleb Connolly	2022-10-25	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	IPA v3.1 doesn't support the IDLE_INDICATION_CFG register, this was causing a harmless splat in ipa_idle_indication_cfg(), add a version check to prevent trying to fetch this register on v3.1 Fixes: 6a244b75cfab ("net: ipa: introduce ipa_reg()") Signed-off-by: Caleb Connolly <caleb.connolly@linaro.org> Reviewed-by: Alex Elder <elder@linaro.org> Tested-by: Jami Kettunen <jami.kettunen@somainline.org> Link: https://lore.kernel.org/r/20221024234850.4049778-1-caleb.connolly@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* \|	net: ipa: fix v3.1 resource limit masks	Caleb Connolly	2022-10-25	1	-64/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The resource group limits for IPA v3.1 mistakenly used 6 bit wide mask values, when the hardware actually uses 8. Out of range values were silently ignored before, so the IPA worked as expected. However the new generalised register definitions introduce stricter checking here, they now cause some splats and result in the value 0 being written instead. Fix the limit bitmask widths so that the correct values can be written. Fixes: 1c418c4a929c ("net: ipa: define resource group/type IPA register fields") Signed-off-by: Caleb Connolly <caleb.connolly@linaro.org> Reviewed-by: Alex Elder <elder@linaro.org> Tested-by: Jami Kettunen <jami.kettunen@somainline.org> Link: https://lore.kernel.org/r/20221024210336.4014983-2-caleb.connolly@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* \|	net: ipa: fix v3.5.1 resource limit max values	Caleb Connolly	2022-10-25	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some resource limits on IPA v3.5.1 have their max values set to 255, this causes a few splats in ipa_reg_encode and prevents the IPA from booting properly. The limits are all 6 bits wide so adjust the max values to 63. Fixes: 1c418c4a929c ("net: ipa: define resource group/type IPA register fields") Signed-off-by: Caleb Connolly <caleb.connolly@linaro.org> Reviewed-by: Alex Elder <elder@linaro.org> Link: https://lore.kernel.org/r/20221024210336.4014983-1-caleb.connolly@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* \|	net: ksz884x: fix missing pci_disable_device() on error in pcidev_init()	Yang Yingliang	2022-10-25	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	pci_disable_device() need be called while module exiting, switch to use pcim_enable(), pci_disable_device() will be called in pcim_release() while unbinding device. Fixes: 8ca86fd83eae ("net: Micrel KSZ8841/2 PCI Ethernet driver") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20221024131338.2848959-1-yangyingliang@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* \|	i40e: Fix flow-type by setting GL_HASH_INSET registers	Slawomir Laba	2022-10-25	1	-33/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix setting bits for specific flow_type for GLQF_HASH_INSET register. In previous version all of the bits were set only in hena register, while in inset only one bit was set. In order for this working correctly on all types of cards these bits needs to be set correctly for both hena and inset registers. Fixes: eb0dd6e4a3b3 ("i40e: Allow RSS Hash set with less than four parameters") Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com> Signed-off-by: Michal Jaron <michalx.jaron@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20221024100526.1874914-3-jacob.e.keller@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* \|	i40e: Fix VF hang when reset is triggered on another VF	Sylwester Dziedziuch	2022-10-25	2	-11/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a reset was triggered on one VF with i40e_reset_vf global PF state __I40E_VF_DISABLE was set on a PF until the reset finished. If immediately after triggering reset on one VF there is a request to reset on another it will cause a hang on VF side because VF will be notified of incoming reset but the reset will never happen because of this global state, we will get such error message: [ +4.890195] iavf 0000:86:02.1: Never saw reset and VF will hang waiting for the reset to be triggered. Fix this by introducing new VF state I40E_VF_STATE_RESETTING that will be set on a VF if it is currently resetting instead of the global __I40E_VF_DISABLE PF state. Fixes: 3ba9bcb4b68f ("i40e: add locking around VF reset") Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20221024100526.1874914-2-jacob.e.keller@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* \|	i40e: Fix ethtool rx-flow-hash setting for X722	Slawomir Laba	2022-10-25	2	-8/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When enabling flow type for RSS hash via ethtool: ethtool -N $pf rx-flow-hash tcp4\|tcp6\|udp4\|udp6 s\|d the driver would fail to setup this setting on X722 device since it was using the mask on the register dedicated for X710 devices. Apply a different mask on the register when setting the RSS hash for the X722 device. When displaying the flow types enabled via ethtool: ethtool -n $pf rx-flow-hash tcp4\|tcp6\|udp4\|udp6 the driver would print wrong values for X722 device. Fix this issue by testing masks for X722 device in i40e_get_rss_hash_opts function. Fixes: eb0dd6e4a3b3 ("i40e: Allow RSS Hash set with less than four parameters") Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com> Signed-off-by: Michal Jaron <michalx.jaron@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20221024100526.1874914-1-jacob.e.keller@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* \|	net: stmmac: rk3588: Allow multiple gmac controller	Benjamin Gaignard	2022-10-25	1	-0/+6
\|/ \| \| \| \| \| \| \| \| \| \|	RK3588(s) can have multiple gmac controllers. Re-use rk3568 logic to distinguish them. Fixes: 2f2b60a0ec28 ("net: ethernet: stmmac: dwmac-rk: Add gmac support for rk3588") Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Link: https://lore.kernel.org/r/20221021172422.88534-1-sebastian.reichel@collabora.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
*	net: lan966x: Stop replacing tx dcbs and dcbs_buf when changing MTU	Horatiu Vultur	2022-10-24	1	-21/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a frame is sent using FDMA, the skb is mapped and then the mapped address is given to an tx dcb that is different than the last used tx dcb. Once the HW finish with this frame, it would generate an interrupt and then the dcb can be reused and memory can be freed. For each dcb there is an dcb buf that contains some meta-data(is used by PTP, is it free). There is 1 to 1 relationship between dcb and dcb_buf. The following issue was observed. That sometimes after changing the MTU to allocate new tx dcbs and dcbs_buf, two frames were not transmitted. The frames were not transmitted because when reloading the tx dcbs, it was always presuming to use the first dcb but that was not always happening. Because it could be that the last tx dcb used before changing MTU was first dcb and then when it tried to get the next dcb it would take dcb 1 instead of 0. Because it is supposed to take a different dcb than the last used one. This can be fixed simply by changing tx->last_in_use to -1 when the fdma is disabled to reload the new dcb and dcbs_buff. But there could be a different issue. For example, right after the frame is sent, the MTU is changed. Now all the dcbs and dcbs_buf will be cleared. And now get the interrupt from HW that it finished with the frame. So when we try to clear the skb, it is not possible because we lost all the dcbs_buf. The solution here is to stop replacing the tx dcbs and dcbs_buf when changing MTU because the TX doesn't care what is the MTU size, it is only the RX that needs this information. Fixes: 2ea1cbac267e ("net: lan966x: Update FDMA to change MTU.") Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Link: https://lore.kernel.org/r/20221021090711.3749009-1-horatiu.vultur@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	net: lantiq_etop: don't free skb when returning NETDEV_TX_BUSY	Zhang Changzhong	2022-10-24	1	-1/+0
\| \| \| \| \| \| \| \| \|	The ndo_start_xmit() method must not free skb when returning NETDEV_TX_BUSY, since caller is going to requeue freed skb. Fixes: 504d4721ee8e ("MIPS: Lantiq: Add ethernet driver") Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: fman: Use physical address for userspace interfaces	Sean Anderson	2022-10-24	4	-10/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Before 262f2b782e25 ("net: fman: Map the base address once"), the physical address of the MAC was exposed to userspace in two places: via sysfs and via SIOCGIFMAP. While this is not best practice, it is an external ABI which is in use by userspace software. The aforementioned commit inadvertently modified these addresses and made them virtual. This constitutes and ABI break. Additionally, it leaks the kernel's memory layout to userspace. Partially revert that commit, reintroducing the resource back into struct mac_device, while keeping the intended changes (the rework of the address mapping). Fixes: 262f2b782e25 ("net: fman: Map the base address once") Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net/mlx5e: Cleanup MACsec uninitialization routine	Leon Romanovsky	2022-10-24	1	-10/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	The mlx5e_macsec_cleanup() routine has NULL pointer dereferencing if mlx5 device doesn't support MACsec (priv->macsec will be NULL). While at it delete comment line, assignment and extra blank lines, so fix everything in one patch. Fixes: 1f53da676439 ("net/mlx5e: Create advanced steering operation (ASO) object for MACsec") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	atlantic: fix deadlock at aq_nic_stop	Íñigo Huguet	2022-10-24	2	-24/+74
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	NIC is stopped with rtnl_lock held, and during the stop it cancels the 'service_task' work and free irqs. However, if CONFIG_MACSEC is set, rtnl_lock is acquired both from aq_nic_service_task and aq_linkstate_threaded_isr. Then a deadlock happens if aq_nic_stop tries to cancel/disable them when they've already started their execution. As the deadlock is caused by rtnl_lock, it causes many other processes to stall, not only atlantic related stuff. Fix it by introducing a mutex that protects each NIC's macsec related data, and locking it instead of the rtnl_lock from the service task and the threaded IRQ. Before this patch, all macsec data was protected with rtnl_lock, but maybe not all of it needs to be protected. With this new mutex, further efforts can be made to limit the protected data only to that which requires it. However, probably it doesn't worth it because all macsec's data accesses are infrequent, and almost all are done from macsec_ops or ethtool callbacks, called holding rtnl_lock, so macsec_mutex won't never be much contended. The issue appeared repeteadly attaching and deattaching the NIC to a bond interface. Doing that after this patch I cannot reproduce the bug. Fixes: 62c1c2e606f6 ("net: atlantic: MACSec offload skeleton") Reported-by: Li Liang <liali@redhat.com> Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Íñigo Huguet <ihuguet@redhat.com> Reviewed-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	nfp: only clean `sp_indiff` when application firmware is unloaded	Yinjun Zhang	2022-10-21	1	-23/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently `sp_indiff` is cleaned when driver is removed. This will cause problem in multi-PF/multi-host case, considering one PF is removed while another is still in use. Since `sp_indiff` is the application firmware property, it should only be cleaned when the firmware is unloaded. Now let management firmware to clean it when necessary, driver only set it. Fixes: b1e4f11e426d ("nfp: refine the ABI of getting `sp_indiff` info") Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20221020081411.80186-1-simon.horman@corigine.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	amd-xgbe: add the bit rate quirk for Molex cables	Raju Rangoju	2022-10-21	1	-1/+8
\| \| \| \| \| \| \| \| \| \| \| \| \|	The offset 12 (bit-rate) of EEPROM SFP DAC (passive) cables is expected to be in the range 0x64 to 0x68. However, the 5 meter and 7 meter Molex passive cables have the rate ceiling 0x78 at offset 12. Add a quirk for Molex passive cables to extend the rate ceiling to 0x78. Fixes: abf0a1c2b26a ("amd-xgbe: Add support for SFP+ modules") Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	amd-xgbe: fix the SFP compliance codes check for DAC cables	Raju Rangoju	2022-10-21	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current XGBE code assumes that offset 6 of EEPROM SFP DAC (passive) cables is NULL. However, some cables (the 5 meter and 7 meter Molex passive cables) have non-zero data at offset 6. Fix the logic by moving the passive cable check above the active checks, so as not to be improperly identified as an active cable. This will fix the issue for any passive cable that advertises 1000Base-CX in offset 6. Fixes: abf0a1c2b26a ("amd-xgbe: Add support for SFP+ modules") Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	amd-xgbe: enable PLL_CTL for fixed PHY modes only	Raju Rangoju	2022-10-21	1	-2/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	PLL control setting(RRC) is needed only in fixed PHY configuration to fix the peer-peer issues. Without the PLL control setting, the link up takes longer time in a fixed phy configuration. Driver implements SW RRC for Autoneg On configuration, hence PLL control setting (RRC) is not needed for AN On configuration, and can be skipped. Also, PLL re-initialization is not needed for PHY Power Off and RRC commands. Otherwise, they lead to mailbox errors. Added the changes accordingly. Fixes: daf182d360e5 ("net: amd-xgbe: Toggle PLL settings during rate change") Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	amd-xgbe: use enums for mailbox cmd and sub_cmds	Raju Rangoju	2022-10-21	2	-13/+41
\| \| \| \| \| \| \| \| \|	Instead of using hardcoded values, use enumerations for mailbox command and sub commands. Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	amd-xgbe: Yellow carp devices do not need rrc	Raju Rangoju	2022-10-21	3	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \|	Link stability issues are noticed on Yellow carp platforms when Receiver Reset Cycle is issued. Since the CDR workaround is disabled on these platforms, the Receiver Reset Cycle is not needed. So, avoid issuing rrc on Yellow carp platforms. Fixes: dbb6c58b5a61 ("net: amd-xgbe: Add Support for Yellow Carp Ethernet device") Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>