summaryrefslogtreecommitdiffstats
path: root/drivers/net/ethernet
Commit message (Collapse)AuthorAgeFilesLines
* net: hns3: Trivial spell fix in hns3 driverSalil Mehta2021-04-092-3/+3
| | | | | | | | | Some trivial spelling mistakes which caught my eye during the review of the code. Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Link: https://lore.kernel.org/r/20210409074223.32480-1-salil.mehta@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* lan743x: fix ethernet frame cutoff issueSven Van Asbroeck2021-04-091-4/+4
| | | | | | | | | | | | | | | | | | | | | The ethernet frame length is calculated incorrectly. Depending on the value of RX_HEAD_PADDING, this may result in ethernet frames that are too short (cut off at the end), or too long (garbage added to the end). Fix by calculating the ethernet frame length correctly. For added clarity, use the ETH_FCS_LEN constant in the calculation. Many thanks to Heiner Kallweit for suggesting this solution. Suggested-by: Heiner Kallweit <hkallweit1@gmail.com> Fixes: 3e21a10fdea3 ("lan743x: trim all 4 bytes of the FCS; not just 2") Link: https://lore.kernel.org/lkml/20210408172353.21143-1-TheSven73@gmail.com/ Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com> Reviewed-by: George McCollister <george.mccollister@gmail.com> Tested-by: George McCollister <george.mccollister@gmail.com> Link: https://lore.kernel.org/r/20210409003904.8957-1-TheSven73@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* ice: fix memory leak of aRFS after resuming from suspendYongxin Liu2021-04-081-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In ice_suspend(), ice_clear_interrupt_scheme() is called, and then irq_free_descs() will be eventually called to free irq and its descriptor. In ice_resume(), ice_init_interrupt_scheme() is called to allocate new irqs. However, in ice_rebuild_arfs(), struct irq_glue and struct cpu_rmap maybe cannot be freed, if the irqs that released in ice_suspend() were reassigned to other devices, which makes irq descriptor's affinity_notify lost. So call ice_free_cpu_rx_rmap() before ice_clear_interrupt_scheme(), which can make sure all irq_glue and cpu_rmap can be correctly released before corresponding irq and descriptor are released. Fix the following memory leak. unreferenced object 0xffff95bd951afc00 (size 512): comm "kworker/0:1", pid 134, jiffies 4294684283 (age 13051.958s) hex dump (first 32 bytes): 18 00 00 00 18 00 18 00 70 fc 1a 95 bd 95 ff ff ........p....... 00 00 ff ff 01 00 ff ff 02 00 ff ff 03 00 ff ff ................ backtrace: [<0000000072e4b914>] __kmalloc+0x336/0x540 [<0000000054642a87>] alloc_cpu_rmap+0x3b/0xb0 [<00000000f220deec>] ice_set_cpu_rx_rmap+0x6a/0x110 [ice] [<000000002370a632>] ice_probe+0x941/0x1180 [ice] [<00000000d692edba>] local_pci_probe+0x47/0xa0 [<00000000503934f0>] work_for_cpu_fn+0x1a/0x30 [<00000000555a9e4a>] process_one_work+0x1dd/0x410 [<000000002c4b414a>] worker_thread+0x221/0x3f0 [<00000000bb2b556b>] kthread+0x14c/0x170 [<00000000ad2cf1cd>] ret_from_fork+0x1f/0x30 unreferenced object 0xffff95bd81b0a2a0 (size 96): comm "kworker/0:1", pid 134, jiffies 4294684283 (age 13051.958s) hex dump (first 32 bytes): 38 00 00 00 01 00 00 00 e0 ff ff ff 0f 00 00 00 8............... b0 a2 b0 81 bd 95 ff ff b0 a2 b0 81 bd 95 ff ff ................ backtrace: [<00000000582dd5c5>] kmem_cache_alloc_trace+0x31f/0x4c0 [<000000002659850d>] irq_cpu_rmap_add+0x25/0xe0 [<00000000495a3055>] ice_set_cpu_rx_rmap+0xb4/0x110 [ice] [<000000002370a632>] ice_probe+0x941/0x1180 [ice] [<00000000d692edba>] local_pci_probe+0x47/0xa0 [<00000000503934f0>] work_for_cpu_fn+0x1a/0x30 [<00000000555a9e4a>] process_one_work+0x1dd/0x410 [<000000002c4b414a>] worker_thread+0x221/0x3f0 [<00000000bb2b556b>] kthread+0x14c/0x170 [<00000000ad2cf1cd>] ret_from_fork+0x1f/0x30 Fixes: 769c500dcc1e ("ice: Add advanced power mgmt for WoL") Signed-off-by: Yongxin Liu <yongxin.liu@windriver.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
* i40e: Fix sparse warning: missing error code 'err'Arkadiusz Kubalewski2021-04-081-2/+6
| | | | | | | | | | | | | Set proper return values inside error checking if-statements. Previously following warning was produced when compiling against sparse. i40e_main.c:15162 i40e_init_recovery_mode() warn: missing error code 'err' Fixes: 4ff0ee1af0169 ("i40e: Introduce recovery mode support") Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
* i40e: Fix sparse error: 'vsi->netdev' could be nullArkadiusz Kubalewski2021-04-081-2/+1
| | | | | | | | | | | | | | | | Remove vsi->netdev->name from the trace. This is redundant information. With the devinfo trace, the adapter is already identifiable. Previously following error was produced when compiling against sparse. i40e_main.c:2571 i40e_sync_vsi_filters() error: we previously assumed 'vsi->netdev' could be null (see line 2323) Fixes: b603f9dc20af ("i40e: Log info when PF is entering and leaving Allmulti mode.") Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
* i40e: Fix sparse error: uninitialized symbol 'ring'Arkadiusz Kubalewski2021-04-081-0/+3
| | | | | | | | | | | | | Init pointer with NULL in default switch case statement. Previously the error was produced when compiling against sparse. i40e_debugfs.c:582 i40e_dbg_dump_desc() error: uninitialized symbol 'ring'. Fixes: 44ea803e2fa7 ("i40e: introduce new dump desc XDP command") Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
* i40e: Fix sparse errors in i40e_txrx.cArkadiusz Kubalewski2021-04-081-7/+5
| | | | | | | | | | | | | | | | | | | | | Remove error handling through pointers. Instead use plain int to return value from i40e_run_xdp(...). Previously: - sparse errors were produced during compilation: i40e_txrx.c:2338 i40e_run_xdp() error: (-2147483647) too low for ERR_PTR i40e_txrx.c:2558 i40e_clean_rx_irq() error: 'skb' dereferencing possible ERR_PTR() - sk_buff* was used to return value, but it has never had valid pointer to sk_buff. Returned value was always int handled as a pointer. Fixes: 0c8493d90b6b ("i40e: add XDP support for pass and drop actions") Fixes: 2e6893123830 ("i40e: split XDP_TX tail and XDP_REDIRECT map flushing") Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
* i40e: Fix parameters in aq_get_phy_register()Grzegorz Siwik2021-04-081-1/+1
| | | | | | | | | | | | Change parameters order in aq_get_phy_register() due to wrong statistics in PHY reported by ethtool. Previously all PHY statistics were exactly the same for all interfaces Now statistics are reported correctly - different for different interfaces Fixes: 0514db37dd78 ("i40e: Extend PHY access with page change flag") Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
* ethtool: Remove link_mode param and derive link params from driverDanielle Ratson2021-04-071-5/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some drivers clear the 'ethtool_link_ksettings' struct in their get_link_ksettings() callback, before populating it with actual values. Such drivers will set the new 'link_mode' field to zero, resulting in user space receiving wrong link mode information given that zero is a valid value for the field. Another problem is that some drivers (notably tun) can report random values in the 'link_mode' field. This can result in a general protection fault when the field is used as an index to the 'link_mode_params' array [1]. This happens because such drivers implement their set_link_ksettings() callback by simply overwriting their private copy of 'ethtool_link_ksettings' struct with the one they get from the stack, which is not always properly initialized. Fix these problems by removing 'link_mode' from 'ethtool_link_ksettings' and instead have drivers call ethtool_params_from_link_mode() with the current link mode. The function will derive the link parameters (e.g., speed) from the link mode and fill them in the 'ethtool_link_ksettings' struct. v3: * Remove link_mode parameter and derive the link parameters in the driver instead of passing link_mode parameter to ethtool and derive it there. v2: * Introduce 'cap_link_mode_supported' instead of adding a validity field to 'ethtool_link_ksettings' struct. [1] general protection fault, probably for non-canonical address 0xdffffc00f14cc32c: 0000 [#1] PREEMPT SMP KASAN KASAN: probably user-memory-access in range [0x000000078a661960-0x000000078a661967] CPU: 0 PID: 8452 Comm: syz-executor360 Not tainted 5.11.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__ethtool_get_link_ksettings+0x1a3/0x3a0 net/ethtool/ioctl.c:446 Code: b7 3e fa 83 fd ff 0f 84 30 01 00 00 e8 16 b0 3e fa 48 8d 3c ed 60 d5 69 8a 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03 +38 d0 7c 08 84 d2 0f 85 b9 RSP: 0018:ffffc900019df7a0 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: ffff888026136008 RCX: 0000000000000000 RDX: 00000000f14cc32c RSI: ffffffff873439ca RDI: 000000078a661960 RBP: 00000000ffff8880 R08: 00000000ffffffff R09: ffff88802613606f R10: ffffffff873439bc R11: 0000000000000000 R12: 0000000000000000 R13: ffff88802613606c R14: ffff888011d0c210 R15: ffff888011d0c210 FS: 0000000000749300(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000004b60f0 CR3: 00000000185c2000 CR4: 00000000001506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: linkinfo_prepare_data+0xfd/0x280 net/ethtool/linkinfo.c:37 ethnl_default_notify+0x1dc/0x630 net/ethtool/netlink.c:586 ethtool_notify+0xbd/0x1f0 net/ethtool/netlink.c:656 ethtool_set_link_ksettings+0x277/0x330 net/ethtool/ioctl.c:620 dev_ethtool+0x2b35/0x45d0 net/ethtool/ioctl.c:2842 dev_ioctl+0x463/0xb70 net/core/dev_ioctl.c:440 sock_do_ioctl+0x148/0x2d0 net/socket.c:1060 sock_ioctl+0x477/0x6a0 net/socket.c:1177 vfs_ioctl fs/ioctl.c:48 [inline] __do_sys_ioctl fs/ioctl.c:753 [inline] __se_sys_ioctl fs/ioctl.c:739 [inline] __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: c8907043c6ac9 ("ethtool: Get link mode in use instead of speed and duplex parameters") Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/mlx5: fix kfree mismatch in indir_table.cXiaoming Ni2021-04-061-5/+5
| | | | | | | | | Memory allocated by kvzalloc() should be freed by kvfree(). Fixes: 34ca65352ddf2 ("net/mlx5: E-Switch, Indirect table infrastructur") Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* net/mlx5: Fix HW spec violation configuring uplinkEli Cohen2021-04-061-2/+3
| | | | | | | | | | | | Make sure to modify uplink port to follow only if the uplink_follow capability is set as required by the HW spec. Failure to do so causes traffic to the uplink representor net device to cease after switching to switchdev mode. Fixes: 7d0314b11cdd ("net/mlx5e: Modify uplink state on interface up/down") Signed-off-by: Eli Cohen <elic@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* net: hns3: clear VF down state bit before request link statusGuangbin Huang2021-04-061-2/+2
| | | | | | | | | | | | | | | | Currently, the VF down state bit is cleared after VF sending link status request command. There is problem that when VF gets link status replied from PF, the down state bit may still set as 1. In this case, the link status replied from PF will be ignored and always set VF link status to down. To fix this problem, clear VF down state bit before VF requests link status. Fixes: e2cb1dec9779 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support") Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* pcnet32: Use pci_resource_len to validate PCI resourceGuenter Roeck2021-04-061-2/+3
| | | | | | | | | | | | | | | | | pci_resource_start() is not a good indicator to determine if a PCI resource exists or not, since the resource may start at address 0. This is seen when trying to instantiate the driver in qemu for riscv32 or riscv64. pci 0000:00:01.0: reg 0x10: [io 0x0000-0x001f] pci 0000:00:01.0: reg 0x14: [mem 0x00000000-0x0000001f] ... pcnet32: card has no PCI IO resources, aborting Use pci_resouce_len() instead. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: broadcom: bcm4908enet: Fix a double free in bcm4908_enet_dma_allocLv Yunlong2021-04-061-0/+1
| | | | | | | | | | | | | | In bcm4908_enet_dma_alloc, if callee bcm4908_dma_alloc_buf_descs() failed, it will free the ring->cpu_addr by dma_free_coherent() and return error. Then bcm4908_enet_dma_free() will be called, and free the same cpu_addr by dma_free_coherent() again. My patch set ring->cpu_addr to NULL after it is freed in bcm4908_dma_alloc_buf_descs() to avoid the double free. Fixes: 4feffeadbcb2e ("net: broadcom: bcm4908enet: add BCM4908 controller driver") Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: hns3: Remove un-necessary 'else-if' in the hclge_reset_event()Salil Mehta2021-04-051-1/+3
| | | | | | | | | Code to defer the reset(which caps the frequency of the reset) schedules the timer and returns. Hence, following 'else-if' looks un-necessary. Fixes: 9de0b86f6444 ("net: hns3: Prevent to request reset frequently") Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: hns3: Remove the left over redundant check & assignmentSalil Mehta2021-04-051-3/+0
| | | | | | | | | | This removes the left over check and assignment which is no longer used anywhere in the function and should have been removed as part of the below mentioned patch. Fixes: 012fcb52f67c ("net: hns3: activate reset timer when calling reset_event") Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: macb: restore cmp registers on resume pathClaudiu Beznea2021-04-021-0/+7
| | | | | | | | | Restore CMP screener registers on resume path. Fixes: c1e85c6ce57ef ("net: macb: save/restore the remaining registers and features") Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* i40e: Fix display statistics for veb_tcEryk Rybak2021-04-011-6/+46
| | | | | | | | | | | | | | | | If veb-stats was enabled, the ethtool stats triggered a warning due to invalid size: 'unexpected stat size for veb.tc_%u_tx_packets'. This was due to an incorrect structure definition for the statistics. Structures and functions have been improved in line with requirements for the presentation of statistics, in particular for the functions: 'i40e_add_ethtool_stats' and 'i40e_add_stat_strings'. Fixes: 1510ae0be2a4 ("i40e: convert VEB TC stats to use an i40e_stats array") Signed-off-by: Eryk Rybak <eryk.roch.rybak@intel.com> Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
* i40e: fix receiving of single packets in xsk zero-copy modeMagnus Karlsson2021-04-011-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix so that single packets are received immediately instead of in batches of 8. If you sent 1 pps to a system, you received 8 packets every 8 seconds instead of 1 packet every second. The problem behind this was that the work_done reporting from the Tx part of the driver was broken. The work_done reporting in i40e controls not only the reporting back to the napi logic but also the setting of the interrupt throttling logic. When Tx or Rx reports that it has more to do, interrupts are throttled or coalesced and when they both report that they are done, interrupts are armed right away. If the wrong work_done value is returned, the logic will start to throttle interrupts in a situation where it should have just enabled them. This leads to the undesired batching behavior seen in user-space. Fix this by returning the correct boolean value from the Tx xsk zero-copy path. Return true if there is nothing to do or if we got fewer packets to process than we asked for. Return false if we got as many packets as the budget since there might be more packets we can process. Fixes: 3106c580fb7c ("i40e: Use batched xsk Tx interfaces to increase performance") Reported-by: Sreedevi Joshi <sreedevi.joshi@intel.com> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
* i40e: Fix inconsistent indentingArkadiusz Kubalewski2021-04-011-4/+4
| | | | | | | | | | | | | Fixed new static analysis findings: "warn: inconsistent indenting" - introduced lately, reported with lkp and smatch build. Fixes: 4b208eaa8078 ("i40e: Add init and default config of software based DCB") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
* net/mlx5e: Guarantee room for XSK wakeup NOP on async ICOSQTariq Toukan2021-03-314-12/+34
| | | | | | | | | | | | | | | | | | XSK wakeup flow triggers an IRQ by posting a NOP WQE and hitting the doorbell on the async ICOSQ. It maintains its state so that it doesn't issue another NOP WQE if it has an outstanding one already. For this flow to work properly, the NOP post must not fail. Make sure to reserve room for the NOP WQE in all WQE posts to the async ICOSQ. Fixes: 8d94b590f1e4 ("net/mlx5e: Turn XSK ICOSQ into a general asynchronous one") Fixes: 1182f3659357 ("net/mlx5e: kTLS, Add kTLS RX HW offload support") Fixes: 0419d8c9d8f8 ("net/mlx5e: kTLS, Add kTLS RX resync support") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* net/mlx5e: Consider geneve_opts for encap contextsDima Chumak2021-03-316-14/+51
| | | | | | | | | | | | | | | | | Current algorithm for encap keys is legacy from initial vxlan implementation and doesn't take into account all possible fields of a tunnel. For example, for a Geneve tunnel, which may have additional TLV options, they are ignored when comparing encap keys and a rule can be attached to an incorrect encap entry. Fix that by introducing encap_info_equal() operation in struct mlx5e_tc_tunnel. Geneve tunnel type uses custom implementation, which extends generic algorithm and considers options if they are set. Fixes: 7f1a546e3222 ("net/mlx5e: Consider tunnel type for encap contexts") Signed-off-by: Dima Chumak <dchumak@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* net/mlx5: Don't request more than supported EQsDaniel Jurgens2021-03-311-1/+12
| | | | | | | | | | | | | Calculating the number of compeltion EQs based on the number of available IRQ vectors doesn't work now that all async EQs share one IRQ. Thus the max number of EQs can be exceeded on systems with more than approximately 256 CPUs. Take this into account when calculating the number of available completion EQs. Fixes: 81bfa206032a ("net/mlx5: Use a single IRQ for all async EQs") Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* net/mlx5e: kTLS, Fix RX counters atomicityTariq Toukan2021-03-315-20/+16
| | | | | | | | | | | | | | | | | | | | | Some TLS RX counters increment per socket/connection, and are not protected against parallel modifications from several cores. Switch them to atomic counters by taking them out of the RQ stats into the global atomic TLS stats. In this patch, we touch 'rx_tls_ctx/del' that count the number of device-offloaded RX TLS connections added/deleted. These counters are updated in the add/del callbacks, out of the fast data-path. This change is not needed for counters that increment only in NAPI context, as they are protected by the NAPI mechanism. Keep them as tls_* counters under 'struct mlx5e_rq_stats'. Fixes: 76c1e1ac2aae ("net/mlx5e: kTLS, Add kTLS RX stats") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* net/mlx5e: kTLS, Fix TX counters atomicityTariq Toukan2021-03-315-26/+33
| | | | | | | | | | | | | | | | | | | | | | Some TLS TX counters increment per socket/connection, and are not protected against parallel modifications from several cores. Switch them to atomic counters by taking them out of the SQ stats into the global atomic TLS stats. In this patch, we touch a single counter 'tx_tls_ctx' that counts the number of device-offloaded TX TLS connections added. Now that this counter can be increased without the for having the SQ context in hand, move it to the mlx5e_ktls_add_tx() callback where it really belongs, out of the fast data-path. This change is not needed for counters that increment only in NAPI context or under the TX lock, as they are already protected. Keep them as tls_* counters under 'struct mlx5e_sq_stats'. Fixes: d2ead1f360e8 ("net/mlx5e: Add kTLS TX HW offload support") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* net/mlx5: E-switch, Create vport miss group only if src rewrite is supportedMaor Dickman2021-03-311-29/+39
| | | | | | | | | | | | | | | Create send to vport miss group was added in order to support traffic recirculation to root table with metadata source rewrite. This group is created also in case source rewrite isn't supported. Fixed by creating send to vport miss group only if source rewrite is supported by FW. Fixes: 8e404fefa58b ("net/mlx5e: Match recirculated packet miss in slow table using reg_c1") Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* net/mlx5e: Fix ethtool indication of connector typeAya Levin2021-03-311-11/+11
| | | | | | | | | | Use connector_type read from PTYS register when it's valid, based on corresponding capability bit. Fixes: 5b4793f81745 ("net/mlx5e: Add support for reading connector type from PTYS") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* net/mlx5: Delete auxiliary bus driver eth-rep firstMaor Dickman2021-03-311-2/+2
| | | | | | | | | | | | | | | Delete auxiliary bus drivers flow deletes the eth driver first and then the eth-reps driver but eth-reps devices resources are depend on eth device. Fixed by changing the delete order of auxiliary bus drivers to delete the eth-rep driver first and after it the eth driver. Fixes: 601c10c89cbb ("net/mlx5: Delete custom device management logic") Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* net/mlx5e: Fix mapping of ct_label zeroAriel Levkovich2021-03-311-7/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | ct_label 0 is a default label each flow has and therefore there can be rules that match on ct_label=0 without a prior rule that set the ct_label to this value. The ct_label value is not used directly in the HW rules and instead it is mapped to some id within a defined range and this id is used to set and match the metadata register which carries the ct_label. If we have a rule that matches on ct_label=0, the hw rule will perform matching on a value that is != 0 because of the mapping from label to id. Since the metadata register default value is 0 and it was never set before to anything else by an action that sets the ct_label, there will always be a mismatch between that register and the value in the rule. To support such rule, a forced mapping of ct_label 0 to id=0 is done so that it will match the metadata register default value of 0. Fixes: 54b154ecfb8c ("net/mlx5e: CT: Map 128 bits labels to 32 bit map ID") Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* nfp: flower: ignore duplicate merge hints from FWYinjun Zhang2021-03-303-3/+69
| | | | | | | | | | | | | | | | | A merge hint message needs some time to process before the merged flow actually reaches the firmware, during which we may get duplicate merge hints if there're more than one packet that hit the pre-merged flow. And processing duplicate merge hints will cost extra host_ctx's which are a limited resource. Avoid the duplicate merge by using hash table to store the sub_flows to be merged. Fixes: 8af56f40e53b ("nfp: flower: offload merge flows") Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethernet/netronome/nfp: Fix a use after free in nfp_bpf_ctrl_msg_rxLv Yunlong2021-03-291-0/+1
| | | | | | | | | | | | | In nfp_bpf_ctrl_msg_rx, if nfp_ccm_get_type(skb) == NFP_CCM_TYPE_BPF_BPF_EVENT is true, the skb will be freed. But the skb is still used by nfp_ccm_rx(&bpf->ccm, skb). My patch adds a return when the skb was freed. Fixes: bcf0cafab44fd ("nfp: split out common control message handling code") Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch '100GbE' of ↵David S. Miller2021-03-2910-41/+86
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2021-03-29 This series contains updates to ice driver only. Ani does not fail on link/PHY errors during probe as this is not a fatal error to prevent the user from remedying the problem. He also corrects checking Wake on LAN support to be port number, not PF ID. Fabio increases the AdminQ timeout as some commands can take longer than the current value. Chinh fixes iSCSI to use be able to use port 860 by using information from DCBx and other QoS configuration info. Krzysztof fixes a possible race between ice_open() and ice_stop(). Bruce corrects the ordering of arguments in a memory allocation call. Dave removes DCBNL device reset bit which is blocking changes coming from DCBNL interface. Jacek adds error handling for filter allocation failure. Robert ensures memory is freed if VSI filter list issues are encountered. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * ice: Cleanup fltr list in case of allocation issuesRobert Malz2021-03-291-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | When ice_remove_vsi_lkup_fltr is called, by calling ice_add_to_vsi_fltr_list local copy of vsi filter list is created. If any issues during creation of vsi filter list occurs it up for the caller to free already allocated memory. This patch ensures proper memory deallocation in these cases. Fixes: 80d144c9ac82 ("ice: Refactor switch rule management structures and functions") Signed-off-by: Robert Malz <robertx.malz@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * ice: Use port number instead of PF ID for WoLAnirudh Venkataramanan2021-03-293-8/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | As per the spec, the WoL control word read from the NVM should be interpreted as port numbers, and not PF numbers. So when checking if WoL supported, use the port number instead of the PF ID. Also, ice_is_wol_supported doesn't really need a pointer to the pf struct, but just needs a pointer to the hw instance. Fixes: 769c500dcc1e ("ice: Add advanced power mgmt for WoL") Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * ice: Fix for dereference of NULL pointerJacek Bułatek2021-03-291-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add handling of allocation fault for ice_vsi_list_map_info. Also *fi should not be NULL pointer, it is a reference to raw data field, so remove this variable and use the reference directly. Fixes: 9daf8208dd4d ("ice: Add support for switch filter programming") Signed-off-by: Jacek Bułatek <jacekx.bulatek@intel.com> Co-developed-by: Haiyue Wang <haiyue.wang@intel.com> Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * ice: remove DCBNL_DEVRESET bit from PF stateDave Ertman2021-03-293-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The original purpose of the ICE_DCBNL_DEVRESET was to protect the driver during DCBNL device resets. But, the flow for DCBNL device resets now consists of only calls up the stack such as dev_close() and dev_open() that will result in NDO calls to the driver. These will be handled with state changes from the stack. Also, there is a problem of the dev_close and dev_open being blocked by checks for reset in progress also using the ICE_DCBNL_DEVRESET bit. Since the ICE_DCBNL_DEVRESET bit is not necessary for protecting the driver from DCBNL device resets and it is actually blocking changes coming from the DCBNL interface, remove the bit from the PF state and don't block driver function based on DCBNL reset in progress. Fixes: b94b013eb626 ("ice: Implement DCBNL support") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * ice: fix memory allocation callBruce Allan2021-03-291-1/+1
| | | | | | | | | | | | | | | | | | | | Fix the order of number of array members and member size parameters in a *calloc() call. Fixes: b3c3890489f6 ("ice: avoid unnecessary single-member variable-length structs") Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * ice: prevent ice_open and ice_stop during resetKrzysztof Goreczny2021-03-293-2/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a possibility of race between ice_open or ice_stop calls performed by OS and reset handling routine both trying to modify VSI resources. Observed scenarios: - reset handler deallocates memory in ice_vsi_free_arrays and ice_open tries to access it in ice_vsi_cfg_txq leading to driver crash - reset handler deallocates memory in ice_vsi_free_arrays and ice_close tries to access it in ice_down leading to driver crash - reset handler clears port scheduler topology and sets port state to ICE_SCHED_PORT_STATE_INIT leading to ice_ena_vsi_txq fail in ice_open To prevent this additional checks in ice_open and ice_stop are introduced to make sure that OS is not allowed to alter VSI config while reset is in progress. Fixes: cdedef59deb0 ("ice: Configure VSIs for Tx/Rx") Signed-off-by: Krzysztof Goreczny <krzysztof.goreczny@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * ice: Recognize 860 as iSCSI port in CEE modeChinh T Cao2021-03-292-9/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | iSCSI can use both TCP ports 860 and 3260. However, in our current implementation, the ice_aqc_opc_get_cee_dcb_cfg (0x0A07) AQ command doesn't provide a way to communicate the protocol port number to the AQ's caller. Thus, we assume that 3260 is the iSCSI port number at the AQ's caller layer. Rely on the dcbx-willing mode, desired QoS and remote QoS configuration to determine which port number that iSCSI will use. Fixes: 0ebd3ff13cca ("ice: Add code for DCB initialization part 2/4") Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * ice: Increase control queue timeoutFabio Pricoco2021-03-291-2/+2
| | | | | | | | | | | | | | | | | | | | 250 msec timeout is insufficient for some AQ commands. Advice from FW team was to increase the timeout. Increase to 1 second. Fixes: 7ec59eeac804 ("ice: Add support for control queues") Signed-off-by: Fabio Pricoco <fabio.pricoco@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * ice: Continue probe on link/PHY errorsAnirudh Venkataramanan2021-03-291-9/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | An incorrect NVM update procedure can result in the driver failing probe. In this case, the recommended resolution method is to update the NVM using the right procedure. However, if the driver fails probe, the user will not be able to update the NVM. So do not fail probe on link/PHY errors. Fixes: 1a3571b5938c ("ice: restore PHY settings on media insertion") Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
* | cxgb4: avoid collecting SGE_QBASE regs during trafficRahul Lakkireddy2021-03-292-5/+21
| | | | | | | | | | | | | | | | | | | | | | Accessing SGE_QBASE_MAP[0-3] and SGE_QBASE_INDEX registers can lead to SGE missing doorbells under heavy traffic. So, only collect them when adapter is idle. Also update the regdump range to skip collecting these registers. Fixes: 80a95a80d358 ("cxgb4: collect SGE PF/VF queue map") Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | gianfar: Handle error code at MAC address changeClaudiu Manoil2021-03-291-1/+5
| | | | | | | | | | | | | | | | Handle return error code of eth_mac_addr(); Fixes: 3d23a05c75c7 ("gianfar: Enable changing mac addr when if up") Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | ethernet: myri10ge: Fix a use after free in myri10ge_sw_tsoLv Yunlong2021-03-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | In myri10ge_sw_tso, the skb_list_walk_safe macro will set (curr) = (segs) and (next) = (curr)->next. If status!=0 is true, the memory pointed by curr and segs will be free by dev_kfree_skb_any(curr). But later, the segs is used by segs = segs->next and causes a uaf. As (next) = (curr)->next, my patch replaces seg->next to next. Fixes: 536577f36ff7a ("net: myri10ge: use skb_list_walk_safe helper for gso segments") Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
* | mlxsw: spectrum: Fix ECN marking in tunnel decapsulationIdo Schimmel2021-03-293-8/+21
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | Cited commit changed the behavior of the software data path with regards to the ECN marking of decapsulated packets. However, the commit did not change other callers of __INET_ECN_decapsulate(), namely mlxsw. The driver is using the function in order to ensure that the hardware and software data paths act the same with regards to the ECN marking of decapsulated packets. The discrepancy was uncovered by commit 5aa3c334a449 ("selftests: forwarding: vxlan_bridge_1d: Fix vxlan ecn decapsulate value") that aligned the selftest to the new behavior. Without this patch the selftest passes when used with veth pairs, but fails when used with mlxsw netdevs. Fix this by instructing the device to propagate the ECT(1) mark from the outer header to the inner header when the inner header is ECT(0), for both NVE and IP-in-IP tunnels. A helper is added in order not to duplicate the code between both tunnel types. Fixes: b723748750ec ("tunnel: Propagate ECT(1) when decapsulating as recommended by RFC6040") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch '40GbE' of ↵David S. Miller2021-03-254-6/+16
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2021-03-25 This series contains updates to virtchnl header file and i40e driver. Norbert removes added padding from virtchnl RSS structures as this causes issues when iterating over the arrays. Mateusz adds Asym_Pause as supported to allow these settings to be set as the hardware supports it. Eryk fixes an issue where encountering a VF reset alongside releasing VFs could cause a call trace. Arkadiusz moves TC setup before resource setup as previously it was possible to enter with a null q_vector causing a kernel oops. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * i40e: Fix oops at i40e_rebuild()Arkadiusz Kubalewski2021-03-251-6/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Setup TC before the i40e_setup_pf_switch() call. Memory must be initialized for all the queues before using its resources. Previously it could be possible that a call: xdp_rxq_info_reg(&rx_ring->xdp_rxq, rx_ring->netdev, rx_ring->queue_index, rx_ring->q_vector->napi.napi_id); was made with q_vector being null. Oops could show up with the following sequence: - no driver loaded - FW LLDP agent is on (flag disable-fw-lldp:off) - link is up - DCB configured with number of Traffic Classes that will not divide completely the default number of queues (usually cpu cores) - driver load - set private flag: disable-fw-lldp:on Fixes: 4b208eaa8078 ("i40e: Add init and default config of software based DCB") Fixes: b02e5a0ebb17 ("xsk: Propagate napi_id to XDP socket Rx path") Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * i40e: Fix kernel oops when i40e driver removes VF'sEryk Rybak2021-03-252-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the reason of kernel oops when i40e driver removed VFs. Added new __I40E_VFS_RELEASING state to signalize releasing process by PF, that it makes possible to exit of reset VF procedure. Without this patch, it is possible to suspend the VFs reset by releasing VFs resources procedure. Retrying the reset after the timeout works on the freed VF memory causing a kernel oops. Fixes: d43d60e5eb95 ("i40e: ensure reset occurs when disabling VF") Signed-off-by: Eryk Rybak <eryk.roch.rybak@intel.com> Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * i40e: Added Asym_Pause to supported link modesMateusz Palczewski2021-03-251-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add Asym_Pause to supported link modes (it is supported by HW). Lack of Asym_Pause in supported modes can cause several problems, i.e. it won't be possible to turn the autonegotiation on with asymmetric pause settings (i.e. Tx on, Rx off). Fixes: 4e91bcd5d47a ("i40e: Finish implementation of ethtool get settings") Signed-off-by: Dawid Lukwinski <dawid.lukwinski@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
* | amd-xgbe: Update DMA coherency valuesShyam Sundar S K2021-03-251-3/+3
| | | | | | | | | | | | | | | | | | | | | | Based on the IOMMU configuration, the current cache control settings can result in possible coherency issues. The hardware team has recommended new settings for the PCI device path to eliminate the issue. Fixes: 6f595959c095 ("amd-xgbe: Adjust register settings to improve performance") Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>