path: root/drivers/net/ethernet/intel
Commit message [Author, Age, Files, Lines]
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net [Jakub Kicinski, 2021-12-16, 8 files, -26/+36]
|\
| |   No conflicts.
| |
| |   Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * ixgbe: set X550 MDIO speed before talking to PHY [Cyril Novikov, 2021-12-15, 1 file, -0/+3]
| |
| |   The MDIO bus speed must be initialized before talking to the PHY the first time in order to avoid talking to it using a speed that the PHY doesn't support.
| |
| |   This fixes HW initialization error -17 (IXGBE_ERR_PHY_ADDR_INVALID) on Denverton CPUs (a.k.a. the Atom C3000 family) on ports with a 10Gb network plugged in. On those devices, HLREG0[MDCSPD] resets to 1, which combined with the 10Gb network results in a 24MHz MDIO speed, which is apparently too fast for the connected PHY. PHY register reads over MDIO bus return garbage, leading to initialization failure.
| |
| |   Reproduced with Linux kernel 4.19 and 5.15-rc7. Can be reproduced using the following setup:
| |
| |   * Use an Atom C3000 family system with at least one X552 LAN on the SoC
| |   * Disable PXE or other BIOS network initialization if possible (the interface must not be initialized before Linux boots)
| |   * Connect a live 10Gb Ethernet cable to an X550 port
| |   * Power cycle (not reset, doesn't always work) the system and boot Linux
| |   * Observe: ixgbe interfaces w/ 10GbE cables plugged in fail with error -17
| |
| |   Fixes: e84db7272798 ("ixgbe: Introduce function to control MDIO speed")
| |   Signed-off-by: Cyril Novikov <cnovikov@lynx.com>
| |   Reviewed-by: Andrew Lunn <andrew@lunn.ch>
| |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * ixgbe: Document how to enable NBASE-T support [Robert Schlabbach, 2021-12-15, 1 file, -0/+4]
| |
| |   Commit a296d665eae1 ("ixgbe: Add ethtool support to enable 2.5 and 5.0 Gbps support") introduced suppression of the advertisement of NBASE-T speeds by default, according to Todd Fujinaka to accommodate customers with network switches which could not cope with advertised NBASE-T speeds, as posted in the E1000-devel mailing list:
| |
| |   https://sourceforge.net/p/e1000/mailman/message/37106269/
| |
| |   However, the suppression was not documented at all, nor was how to enable NBASE-T support.
| |
| |   Properly document the NBASE-T suppression and how to enable NBASE-T support.
| |
| |   Fixes: a296d665eae1 ("ixgbe: Add ethtool support to enable 2.5 and 5.0 Gbps support")
| |   Reported-by: Robert Schlabbach <robert_s@gmx.net>
| |   Signed-off-by: Robert Schlabbach <robert_s@gmx.net>
| |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * igc: Fix typo in i225 LTR functions [Sasha Neftin, 2021-12-15, 1 file, -1/+1]
| |
| |   The LTR maximum value was incorrectly written using the scale from the LTR minimum value. This would cause incorrect values to be sent in cases where the initial calculation led to different min/max scales.
| |
| |   Fixes: 707abf069548 ("igc: Add initial LTR support")
| |   Suggested-by: Dima Ruinskiy <dima.ruinskiy@intel.com>
| |   Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
| |   Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>
| |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
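To make the min/max scale mix-up concrete, here is a minimal sketch of a value-plus-scale latency encoding. It assumes the generic PCIe LTR convention (a 10-bit value multiplied by 2^(5*scale) ns); the struct and function names are hypothetical and this is not the igc register code.

```c
#include <stdint.h>

/* Hypothetical encoding: 10-bit value, 3-bit scale, latency = value << (5 * scale) ns.
 * This models the generic PCIe LTR convention, not the igc register layout. */
struct ltr_enc {
	uint16_t value;
	uint8_t scale;
};

/* Pick the smallest scale for which the value fits in 10 bits. */
static struct ltr_enc ltr_encode(uint64_t ns)
{
	struct ltr_enc e = { 0, 0 };
	uint64_t v;

	while ((ns >> (5 * e.scale)) > 1023 && e.scale < 5)
		e.scale++;
	v = ns >> (5 * e.scale);
	e.value = (uint16_t)(v > 1023 ? 1023 : v);
	return e;
}

/* Reverse the encoding; the result depends entirely on which scale is used. */
static uint64_t ltr_decode(uint16_t value, uint8_t scale)
{
	return (uint64_t)value << (5 * scale);
}
```

When the minimum and maximum latencies land on different scales, pairing the maximum value with the minimum's scale decodes to a much smaller latency than intended, which is the class of error the commit describes.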
| * igbvf: fix double free in `igbvf_probe` [Letu Ren, 2021-12-15, 1 file, -0/+1]
| |
| |   In `igbvf_probe`, if register_netdev() fails, the program will go to label err_hw_init, and then to label err_ioremap. In free_netdev(), which is just below label err_ioremap, there are `list_for_each_entry_safe` and `netif_napi_del`, which aim to delete all entries in `dev->napi_list`. The program has added an entry `adapter->rx_ring->napi`, which is added by `netif_napi_add` in igbvf_alloc_queues(). However, adapter->rx_ring has been freed below label err_hw_init. So this is a UAF.
| |
| |   In terms of how to patch the problem, we can refer to igbvf_remove() and delete the entry before `adapter->rx_ring`.
| |
| |   The KASAN logs are as follows:
| |
| |   [ 35.126075] BUG: KASAN: use-after-free in free_netdev+0x1fd/0x450
| |   [ 35.127170] Read of size 8 at addr ffff88810126d990 by task modprobe/366
| |   [ 35.128360]
| |   [ 35.128643] CPU: 1 PID: 366 Comm: modprobe Not tainted 5.15.0-rc2+ #14
| |   [ 35.129789] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
| |   [ 35.131749] Call Trace:
| |   [ 35.132199] dump_stack_lvl+0x59/0x7b
| |   [ 35.132865] print_address_description+0x7c/0x3b0
| |   [ 35.133707] ? free_netdev+0x1fd/0x450
| |   [ 35.134378] __kasan_report+0x160/0x1c0
| |   [ 35.135063] ? free_netdev+0x1fd/0x450
| |   [ 35.135738] kasan_report+0x4b/0x70
| |   [ 35.136367] free_netdev+0x1fd/0x450
| |   [ 35.137006] igbvf_probe+0x121d/0x1a10 [igbvf]
| |   [ 35.137808] ? igbvf_vlan_rx_add_vid+0x100/0x100 [igbvf]
| |   [ 35.138751] local_pci_probe+0x13c/0x1f0
| |   [ 35.139461] pci_device_probe+0x37e/0x6c0
| |   [ 35.165526]
| |   [ 35.165806] Allocated by task 366:
| |   [ 35.166414] ____kasan_kmalloc+0xc4/0xf0
| |   [ 35.167117] foo_kmem_cache_alloc_trace+0x3c/0x50 [igbvf]
| |   [ 35.168078] igbvf_probe+0x9c5/0x1a10 [igbvf]
| |   [ 35.168866] local_pci_probe+0x13c/0x1f0
| |   [ 35.169565] pci_device_probe+0x37e/0x6c0
| |   [ 35.179713]
| |   [ 35.179993] Freed by task 366:
| |   [ 35.180539] kasan_set_track+0x4c/0x80
| |   [ 35.181211] kasan_set_free_info+0x1f/0x40
| |   [ 35.181942] ____kasan_slab_free+0x103/0x140
| |   [ 35.182703] kfree+0xe3/0x250
| |   [ 35.183239] igbvf_probe+0x1173/0x1a10 [igbvf]
| |   [ 35.184040] local_pci_probe+0x13c/0x1f0
| |
| |   Fixes: d4e0fe01a38a0 (igbvf: add new driver to support 82576 virtual functions)
| |   Reported-by: Zheyu Ma <zheyuma97@gmail.com>
| |   Signed-off-by: Letu Ren <fantasquex@gmail.com>
| |   Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
| |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
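The ordering problem in the igbvf commit can be modelled in userspace. The sketch below is an illustration under assumptions: `napi_node`, `ring`, `netdev`, and the helper functions are hypothetical stand-ins, not the kernel's NAPI API. It shows only the rule the fix follows: unlink a list node before freeing the allocation that embeds it.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
struct napi_node {
	struct napi_node *next;
};

struct ring {
	struct napi_node napi;	/* list node embedded in the ring allocation */
};

struct netdev {
	struct napi_node *napi_list;
};

/* Link a node at the head of the device's list. */
static void napi_add(struct netdev *dev, struct napi_node *n)
{
	n->next = dev->napi_list;
	dev->napi_list = n;
}

/* Unlink a node if present, so the list never points at it afterwards. */
static void napi_del(struct netdev *dev, struct napi_node *n)
{
	struct napi_node **pp;

	for (pp = &dev->napi_list; *pp; pp = &(*pp)->next) {
		if (*pp == n) {
			*pp = n->next;
			n->next = NULL;
			return;
		}
	}
}

/* Safe unwind order: unlink first, then free the memory holding the node. */
static void teardown_ring(struct netdev *dev, struct ring *ring)
{
	napi_del(dev, &ring->napi);
	free(ring);
}
```

If `free(ring)` ran before `napi_del()`, a later walk of `dev->napi_list` would read freed memory, which is the same shape as the report in the KASAN log.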
| * igb: Fix removal of unicast MAC filters of VFs [Karen Sornek, 2021-12-15, 1 file, -14/+14]
| |
| |   Move checking condition of VF MAC filter before clearing or adding MAC filter to VF to prevent potential blackout caused by removal of necessary and working VF's MAC filter.
| |
| |   Fixes: 1b8b062a99dc ("igb: add VF trust infrastructure")
| |   Signed-off-by: Karen Sornek <karen.sornek@intel.com>
| |   Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
| |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * ice: Don't put stale timestamps in the skb [Karol Kolacinski, 2021-12-14, 2 files, -7/+10]
| |
| |   The driver has to make sure that it does not accidentally put a timestamp in the SKB before the previous timestamp gets overwritten. Timestamp values in the PHY are read only and do not get cleared except at hardware reset or when a new timestamp value is captured. The cached_tstamp field is used to detect the case where a new timestamp has not yet been captured, ensuring that we avoid sending stale timestamp data to the stack.
| |
| |   Fixes: ea9b847cda64 ("ice: enable transmit timestamps for E810 devices")
| |   Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
| |   Tested-by: Gurucharan G <gurucharanx.g@intel.com>
| |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
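The stale-timestamp check can be sketched as a comparison against the last value seen. This is a simplified model that assumes one cached value per slot; only the `cached_tstamp` name comes from the commit message, and the struct and function are hypothetical rather than the driver's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-slot state; only the cached_tstamp name is from the commit. */
struct tx_tstamp_slot {
	uint64_t cached_tstamp;	/* last raw value already handed to the stack */
};

/*
 * Return true and remember the value when the register holds a new capture.
 * Return false when it still holds the previously seen (stale) value.
 */
static bool tstamp_is_fresh(struct tx_tstamp_slot *slot, uint64_t raw)
{
	if (raw == slot->cached_tstamp)
		return false;
	slot->cached_tstamp = raw;
	return true;
}
```

A caller in this model would skip delivery and retry later whenever the function returns false, since the register is never cleared between captures.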
| * ice: Use div64_u64 instead of div_u64 in adjfine [Karol Kolacinski, 2021-12-14, 1 file, -1/+1]
| |
| |   Change the division in ice_ptp_adjfine from div_u64 to div64_u64. div_u64 is used when the divisor is 32 bit but in this case incval is 64 bit and it caused incorrect calculations and incval adjustments.
| |
| |   Fixes: 06c16d89d2cb ("ice: register 1588 PTP clock device object for E810 devices")
| |   Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
| |   Tested-by: Gurucharan G <gurucharanx.g@intel.com>
| |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
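The effect of passing a 64-bit divisor to a helper that takes a 32-bit one can be reproduced in userspace. The functions below are simplified models with hypothetical names (`model_div_u64`, `model_div64_u64`); they mirror only the parameter widths of the kernel helpers, not their implementations.

```c
#include <stdint.h>

/* Model of a helper whose divisor parameter is only 32 bits wide. */
static uint64_t model_div_u64(uint64_t dividend, uint32_t divisor)
{
	return dividend / divisor;
}

/* Model of a helper that accepts a full 64-bit divisor. */
static uint64_t model_div64_u64(uint64_t dividend, uint64_t divisor)
{
	return dividend / divisor;
}

/*
 * What happens when a 64-bit divisor is handed to the 32-bit helper:
 * the upper 32 bits are dropped by the implicit conversion.
 */
static uint64_t divide_with_truncated_divisor(uint64_t dividend, uint64_t divisor)
{
	return model_div_u64(dividend, (uint32_t)divisor);
}
```

With a divisor of 0x100000001, the truncated call divides by 1 instead, so the quotient is off by a factor of roughly 2^32.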
| * iavf: do not override the adapter state in the watchdog task (again) [Stefan Assmann, 2021-12-13, 1 file, -1/+0]
| |
| |   The watchdog task incorrectly changes the state to __IAVF_RESETTING, instead of letting the reset task take care of that. This was already resolved by commit 22c8fd71d3a5 ("iavf: do not override the adapter state in the watchdog task") but the problem was reintroduced by the recent code refactoring in commit 45eebd62999d ("iavf: Refactor iavf state machine tracking").
| |
| |   Fixes: 45eebd62999d ("iavf: Refactor iavf state machine tracking")
| |   Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
| |   Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
| |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * iavf: missing unlocks in iavf_watchdog_task() [Dan Carpenter, 2021-12-13, 1 file, -2/+2]
| |
| |   This code was re-organized and there are some unlocks missing now.
| |
| |   Fixes: 898ef1cb1cb2 ("iavf: Combine init and watchdog state machines")
| |   Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
| |   Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
| |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
* | ice: use modern kernel API for kick [Jesse Brandeburg, 2021-12-15, 1 file, -4/+5]
| |
| |   The kernel gained a new interface for drivers to use to combine tail bump (doorbell) and BQL updates; attempt to use those new interfaces.
| |
| |   Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
| |   Tested-by: Gurucharan G <gurucharanx.g@intel.com>
| |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
* | ice: tighter control over VSI_DOWN state [Jesse Brandeburg, 2021-12-15, 2 files, -5/+8]
| |
| |   The driver had comments to the effect of: "This flag should be set before calling this function."
| |
| |   While reviewing code it was found that there were several violations of this policy, which could introduce hard-to-find bugs or races. Fix the violations of the "VSI DOWN state must be set before calling ice_down" rule and turn the state check into code with a WARN_ON.
| |
| |   Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
| |   Tested-by: Gurucharan G <gurucharanx.g@intel.com>
| |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
* | ice: use prefetch methods [Jesse Brandeburg, 2021-12-15, 1 file, -1/+12]
| |
| |   The kernel provides some prefetch mechanisms to speed up commonly cold cache line accesses during receive processing. Since these are software structures it helps to have these strategically placed prefetches.
| |
| |   Be careful to call BQL prefetch complete only for non-XDP queues.
| |
| |   Co-developed-by: Piotr Raczynski <piotr.raczynski@intel.com>
| |   Signed-off-by: Piotr Raczynski <piotr.raczynski@intel.com>
| |   Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
| |   Tested-by: Gurucharan G <gurucharanx.g@intel.com>
| |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
* | ice: update to newer kernel API [Jesse Brandeburg, 2021-12-15, 1 file, -9/+9]
| |
| |   Use the netif_tx_* API from netdevice.h which has simpler parameters.
| |
| |   Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
| |   Tested-by: Gurucharan G <gurucharanx.g@intel.com>
| |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
* | ice: support immediate firmware activation via devlink reload [Jacob Keller, 2021-12-15, 10 files, -27/+308]
| |
| |   The ice hardware contains an embedded chip with firmware which can be updated using devlink flash. The firmware which runs on this chip is referred to as the Embedded Management Processor firmware (EMP firmware).
| |
| |   Activating the new firmware image currently requires that the system be rebooted. This is not ideal as rebooting the system can cause unwanted downtime.
| |
| |   In practical terms, activating the firmware does not always require a full system reboot. In many cases it is possible to activate the EMP firmware immediately. There are a couple of different scenarios to cover.
| |
| |   * The EMP firmware itself can be reloaded by issuing a special update to the device called an Embedded Management Processor reset (EMP reset). This reset causes the device to reset and reload the EMP firmware.
| |   * PCI configuration changes are only reloaded after a cold PCIe reset. Unfortunately there is no generic way to trigger this for a PCIe device without a system reboot.
| |
| |   When performing a flash update, firmware is capable of responding with some information about the specific update requirements.
| |
| |   The driver updates the flash by programming a secondary inactive bank with the contents of the new image, and then issuing a command to request to switch the active bank starting from the next load.
| |
| |   The response to the final command for updating the inactive NVM flash bank includes an indication of the minimum reset required to fully update the device. This can be one of the following:
| |
| |   * A full power on is required
| |   * A cold PCIe reset is required
| |   * An EMP reset is required
| |
| |   The response to the command to switch flash banks includes an indication of whether or not the firmware will allow an EMP reset request.
| |
| |   For most updates, an EMP reset is sufficient to load the new EMP firmware without issues. In some cases, this reset is not sufficient because the PCI configuration space has changed. When this could cause incompatibility with the new EMP image, the firmware is capable of rejecting the EMP reset request.
| |
| |   Add logic to ice_fw_update.c to handle the response data of the flash update AdminQ commands.
| |
| |   For the reset level, issue a devlink status notification informing the user of how to complete the update with a simple suggestion like "Activate new firmware by rebooting the system".
| |
| |   Cache the status of whether or not firmware will restrict the EMP reset for use in implementing devlink reload.
| |
| |   Implement support for devlink reload with the "fw_activate" flag. This allows user space to request the firmware be activated immediately.
| |
| |   For the .reload_down handler, we will issue a request for the EMP reset using the appropriate firmware AdminQ command. If we know that the firmware will not allow an EMP reset, simply exit with a suitable netlink extended ACK message indicating that the EMP reset is not available.
| |
| |   For the .reload_up handler, simply wait until the driver has finished resetting. Logic to handle processing of an EMP reset already exists in the driver as part of its reset and rebuild flows.
| |
| |   Implement support for the devlink reload interface with the "fw_activate" action. This allows userspace to request activation of firmware without a reboot.
| |
| |   Note that indicating the required reset and the EMP reset restriction is not supported on old versions of firmware. The driver can determine if the two features are supported by checking the device capabilities report. I confirmed support has existed since at least version 5.5.2 as reported by the 'fw.mgmt' version. Support to issue the EMP reset request has existed in all versions of the EMP firmware for the ice hardware.
| |
| |   Check the device capabilities report to determine whether or not the indications are reported by the running firmware. If the reset requirement indication is not supported, always assume a full power on is necessary. If the reset restriction capability is not supported, always assume the EMP reset is available.
| |
| |   Users can verify if the EMP reset has activated the firmware by using the devlink info report to check that the 'running' firmware version has updated. For example a user might do the following:
| |
| |     # Check current version
| |     $ devlink dev info
| |
| |     # Update the device
| |     $ devlink dev flash pci/0000:af:00.0 file firmware.bin
| |
| |     # Confirm stored version updated
| |     $ devlink dev info
| |
| |     # Reload to activate new firmware
| |     $ devlink dev reload pci/0000:af:00.0 action fw_activate
| |
| |     # Confirm running version updated
| |     $ devlink dev info
| |
| |   Finally, this change does *not* implement basic driver-only reload support. I did look into trying to do this. However, it requires a significant refactor of how the ice driver probes and loads everything. The ice driver probe and allocation flows were not designed with such a reload in mind. Refactoring the flow to support this is beyond the scope of this change.
| |
| |   Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
| |   Tested-by: Gurucharan G <gurucharanx.g@intel.com>
| |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
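The fallback rules in the devlink reload commit (assume a full power on when the reset level is not reported, assume the EMP reset is available when the restriction is not reported) can be written as two small predicates. This is a sketch under assumptions: the enum, struct, and field names are hypothetical and do not reflect the driver's data structures.

```c
#include <stdbool.h>

/* Hypothetical names for the three reset levels listed in the commit message. */
enum reset_level {
	RESET_EMP,
	RESET_PCIE_COLD,
	RESET_POWER_ON,
};

/* Hypothetical summary of what the running firmware reported. */
struct fw_update_info {
	bool has_reset_level_cap;	/* firmware reports the minimum reset level */
	bool has_emp_restrict_cap;	/* firmware reports the EMP reset restriction */
	enum reset_level reported_level;
	bool reported_emp_restricted;
};

/* Without the capability, conservatively assume a full power on is needed. */
static enum reset_level required_reset(const struct fw_update_info *info)
{
	return info->has_reset_level_cap ? info->reported_level : RESET_POWER_ON;
}

/* Without the capability, assume the EMP reset request will be accepted. */
static bool emp_reset_allowed(const struct fw_update_info *info)
{
	return info->has_emp_restrict_cap ? !info->reported_emp_restricted : true;
}
```

In this model a reload handler would consult `emp_reset_allowed()` before issuing the reset and report an error to the user otherwise.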
* | ice: reduce time to read Option ROM CIVD data [Jacob Keller, 2021-12-15, 1 file, -12/+36]
| |
| |   During probe and device reset, the ice driver reads some data from the NVM image as part of ice_init_nvm. Part of this data includes a section of the Option ROM which contains version information.
| |
| |   The function ice_get_orom_civd_data is used to locate the '$CIV' data section of the Option ROM.
| |
| |   Timing of ice_probe and ice_rebuild indicates that the ice_get_orom_civd_data function takes about 10 seconds to finish executing.
| |
| |   The function locates the section by scanning the Option ROM every 512 bytes. This requires a significant number of NVM read accesses, since the Option ROM bank is 500KB. In the worst case it would take about 1000 reads. Worse, all PFs serialize this operation during reload because of acquiring the NVM semaphore.
| |
| |   The CIVD section is located at the end of the Option ROM image data. Unfortunately, the driver has no easy method to determine the offset manually. Practical experiments have shown that the data could be at a variety of locations, so simply reversing the scanning order is not sufficient to reduce the overall read time.
| |
| |   Instead, copy the entire contents of the Option ROM into memory. This allows reading the data using 4KB pages instead of 512 bytes at a time. This reduces the total number of firmware commands by a factor of 8. In addition, reading the whole section together at once allows better indication to firmware of when we're "done".
| |
| |   Re-write ice_get_orom_civd_data to allocate virtual memory to store the Option ROM data. Copy the entire Option ROM contents at once using ice_read_flash_module. Finally, use this memory copy to scan for the '$CIV' section.
| |
| |   This change significantly reduces the time to read the Option ROM CIVD section from ~10 seconds down to ~1 second. This has a significant impact on the total time to complete a driver rebuild or probe.
| |
| |   Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
| |   Tested-by: Gurucharan G <gurucharanx.g@intel.com>
| |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
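The read-count arithmetic and the in-memory scan from the Option ROM commit can be sketched as follows. This is an illustration under assumptions: `find_civd` and `reads_needed` are hypothetical helpers, the scan only matches the 4-byte '$CIV' signature at 512-byte boundaries, and any further validation the driver may perform is left out.

```c
#include <stddef.h>
#include <string.h>

#define OROM_SCAN_STEP 512	/* the section is searched at 512-byte boundaries */

/*
 * Scan an in-memory copy of the Option ROM for the "$CIV" signature.
 * Returns the byte offset of the first aligned match, or -1 if none is found.
 */
static long find_civd(const unsigned char *orom, size_t len)
{
	size_t off;

	for (off = 0; off + 4 <= len; off += OROM_SCAN_STEP) {
		if (memcmp(orom + off, "$CIV", 4) == 0)
			return (long)off;
	}
	return -1;
}

/* Number of read commands needed to cover `total` bytes in `chunk`-sized reads. */
static size_t reads_needed(size_t total, size_t chunk)
{
	return (total + chunk - 1) / chunk;
}
```

For a 500KB bank, `reads_needed` gives 1000 reads at 512 bytes and 125 reads at 4KB, which matches the factor of 8 quoted in the commit message.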
* | ice: move ice_devlink_flash_update and merge with ice_flash_pldm_image [Jacob Keller, 2021-12-15, 3 files, -72/+39]
| |
| |   The ice_devlink_flash_update function performs a few upfront checks and then calls ice_flash_pldm_image. Most of these checks make more sense in the context of code within ice_flash_pldm_image. Merge ice_devlink_flash_update and ice_flash_pldm_image into one function, placing it in ice_fw_update.c.
| |
| |   Since this is still the entry point for devlink, call the function ice_devlink_flash_update instead of ice_flash_pldm_image. This leaves a single function which handles the devlink parameters and then initiates a PLDM update.
| |
| |   With this change, the ice_devlink_flash_update function in ice_fw_update.c becomes the main entry point for flash update. It eliminates some unnecessary boilerplate code between the two previous functions. The ultimate motivation for this is that it eases supporting a dry run with the PLDM library in a future change.
| |
| |   Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
| |   Tested-by: Gurucharan G <gurucharanx.g@intel.com>
| |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
* | ice: move and rename ice_check_for_pending_update [Jacob Keller, 2021-12-15, 3 files, -77/+77]
| |
| |   The ice_devlink_flash_update function performs a few checks and then calls ice_flash_pldm_image. One of these checks is to call ice_check_for_pending_update. This function checks if the device has a pending update, and cancels it if so. This is necessary to allow a new flash update to proceed.
| |
| |   We want to refactor the ice code to eliminate ice_devlink_flash_update, moving its checks into ice_flash_pldm_image.
| |
| |   To do this, ice_check_for_pending_update will become static, and only called by ice_flash_pldm_image. To make this change easier to review, first just move the function up within the ice_fw_update.c file.
| |
| |   While at it, note that the function has a misleading name. Its primary action is to cancel a pending update. Using the verb "check" does not imply this. Rename it to ice_cancel_pending_update.
| |
| |   Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
| |   Tested-by: Gurucharan G <gurucharanx.g@intel.com>
| |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
* | ice: devlink: add shadow-ram region to snapshot Shadow RAM [Jacob Keller, 2021-12-15, 2 files, -5/+89]
| |
| |   We have a region for reading the contents of the NVM flash as a snapshot. This region does not allow reading the Shadow RAM, as it always passes the FLASH_ONLY bit to the low level firmware interface.
| |
| |   Add a separate shadow-ram region which will allow snapshot of the current contents of the Shadow RAM. This data is built from the NVM contents but is distinct as the device builds up the Shadow RAM during initialization, so being able to snapshot its contents can be useful when attempting to debug flash related issues.
| |
| |   Fix the comment description of the nvm-flash region which incorrectly stated that it filled the shadow-ram region, and add a comment explaining that the nvm-flash region does not actually read the Shadow RAM.
| |
| |   Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
| |   Tested-by: Gurucharan G <gurucharanx.g@intel.com>
| |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
* | ice: Remove unused ICE_FLOW_SEG_HDRS_L2_MASK [Tony Nguyen, 2021-12-14, 1 file, -2/+0]
| |
| |   Remove the unused define ICE_FLOW_SEG_HDRS_L2_MASK.
| |
| |   Reported-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
| |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| |   Acked-by: Paul Menzel <pmenzel@molgen.mpg.de>
| |   Tested-by: Gurucharan G <gurucharanx.g@intel.com>
* | ice: Remove unnecessary casts [Dan Carpenter, 2021-12-14, 1 file, -4/+2]
| |
| |   The "bitmap" variable is already an unsigned long so there is no need for this cast.
| |
| |   Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
| |   Tested-by: Gurucharan G <gurucharanx.g@intel.com>
| |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
* | ice: Propagate error codes [Tony Nguyen, 2021-12-14, 8 files, -99/+45]
| |
| |   As all functions now return standard error codes, propagate the values being returned instead of converting them to generic values.
| |
| |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| |   Tested-by: Gurucharan G <gurucharanx.g@intel.com>
* | ice: Remove excess error variables [Tony Nguyen, 2021-12-14, 10 files, -282/+223]
| |
| |   Previously, ice_status values were held in one variable while other error codes were held in a second variable. With ice_status now being an int, there is no need for two variables to hold error values. In cases where this occurs, remove one of the excess variables and use a single one. Some variable initializations are no longer needed and have been removed.
| |
| |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| |   Tested-by: Gurucharan G <gurucharanx.g@intel.com>
* | ice: Cleanup after ice_status removal [Tony Nguyen, 2021-12-14, 28 files, -350/+265]
| |
| |   Clean up code after changing ice_status to int. Rearrange to fix reverse Christmas tree and pull lines up where applicable.
| |
| |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| |   Tested-by: Gurucharan G <gurucharanx.g@intel.com>
* | ice: Remove enum ice_status [Tony Nguyen, 2021-12-14, 22 files, -654/+568]
| |
| |   Replace uses of ice_status with error codes that are as equivalent as possible. Remove enum ice_status and its helper conversion function as they are no longer needed.
| |
| |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| |   Tested-by: Gurucharan G <gurucharanx.g@intel.com>
* | ice: Use int for ice_status [Tony Nguyen, 2021-12-14, 33 files, -859/+863]
| |
| |   To prepare for removal of ice_status, change the variables from ice_status to int. This eases the transition when values are changed to return standard int error codes over enum ice_status.
| |
| |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| |   Tested-by: Gurucharan G <gurucharanx.g@intel.com>
* | ice: Remove string printing for ice_status [Tony Nguyen, 2021-12-14, 9 files, -236/+163]
| |
| |   Remove the ice_stat_str() function which prints the string representation of the ice_status error code. With upcoming changes moving away from ice_status, there will be no need for this function.
| |
| |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| |   Tested-by: Gurucharan G <gurucharanx.g@intel.com>
* | ice: Refactor status flow for DDP load [Wojciech Drewek, 2021-12-14, 4 files, -182/+272]
| |
| |   Before this change, the final state of the DDP pkg load process was dependent on many variables such as: ice_status, pkg version, ice_aq_err. The last one had to be stored in hw->pkg_dwnld_status. It was impossible to conclude this state just from ice_status, which is why the logging process of DDP pkg load in the caller was a little bit complicated.
| |
| |   With this patch a new status enum is introduced: ice_ddp_state. It covers all the possible final states of the loading process. What's tricky for ice_ddp_state is that not only ICE_DDP_PKG_SUCCESS(=0) means that the load was successful. Actually three states mean that:
| |
| |   - ICE_DDP_PKG_SUCCESS
| |   - ICE_DDP_PKG_SAME_VERSION_ALREADY_LOADED
| |   - ICE_DDP_PKG_COMPATIBLE_ALREADY_LOADED
| |
| |   ice_is_init_pkg_successful can tell that information.
| |
| |   One ddp_state should not be used outside of ice_init_pkg, which is ICE_DDP_PKG_ALREADY_LOADED. It is more generic; it is used in ice_dwnld_cfg_bufs to see if the pkg is already loaded. At this point we can't use one of the specific ones (SAME_VERSION, COMPATIBLE, NOT_SUPPORTED) because we don't have information on the package currently loaded in HW (we are before calling ice_get_pkg_info).
| |
| |   We can get rid of hw->pkg_dwnld_status because we are immediately mapping aq errors to ice_ddp_state in ice_dwnld_cfg_bufs.
| |
| |   Other errors like ICE_ERR_NO_MEMORY, ICE_ERR_PARAM are mapped to the generic ICE_DDP_PKG_ERR.
| |
| |   Suggested-by: Jacob Keller <jacob.e.keller@intel.com>
| |   Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
| |   Tested-by: Tony Brelinski <tony.brelinski@intel.com>
| |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
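The "three states mean success" rule can be sketched as a small enum and predicate. The enumerator names below are shortened versions of the names in the commit message, the numeric values other than success being 0 are assumptions, and `sketch_is_init_pkg_successful` is a hypothetical stand-in for ice_is_init_pkg_successful rather than its real body.

```c
#include <stdbool.h>

/* Shortened state names; values other than SUCCESS = 0 are assumed. */
enum sketch_ddp_state {
	SKETCH_DDP_PKG_SUCCESS = 0,
	SKETCH_DDP_PKG_SAME_VERSION_ALREADY_LOADED,
	SKETCH_DDP_PKG_COMPATIBLE_ALREADY_LOADED,
	SKETCH_DDP_PKG_ALREADY_LOADED,	/* generic, internal to the init path */
	SKETCH_DDP_PKG_ERR,
};

/* A load counts as successful for any of the three states listed in the commit. */
static bool sketch_is_init_pkg_successful(enum sketch_ddp_state state)
{
	switch (state) {
	case SKETCH_DDP_PKG_SUCCESS:
	case SKETCH_DDP_PKG_SAME_VERSION_ALREADY_LOADED:
	case SKETCH_DDP_PKG_COMPATIBLE_ALREADY_LOADED:
		return true;
	default:
		return false;
	}
}
```

Callers in this model test the predicate instead of comparing against zero, which is the pitfall the commit message warns about.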
* | ice: Refactor promiscuous functions [Brett Creeley, 2021-12-14, 4 files, -102/+156]
| |
| |   Some of the promiscuous mode functions take a boolean to indicate set/clear, which affects readability. Refactor and provide an interface for the promiscuous mode code with explicit set and clear promiscuous mode operations.
| |
| |   Signed-off-by: Brett Creeley <brett.creeley@intel.com>
| |   Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
| |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
* | ice: refactor PTYPE validating [Jeff Guo, 2021-12-14, 4 files, -372/+133]
| |
| |   Since the capability of a PTYPE within a specific package could be negotiated by checking the HW bit map, it means that there's no need to maintain a different PTYPE list for each type of the package when parsing PTYPE. So refactor the PTYPE validating mechanism.
| |
| |   Signed-off-by: Jeff Guo <jia.guo@intel.com>
| |   Tested-by: Tony Brelinski <tony.brelinski@intel.com>
| |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
* | ice: Add package PTYPE enable information [Haiyue Wang, 2021-12-14, 4 files, -0/+98]
| |
| |   Scan the 'Marker Ptype TCAM' section to retrieve the Rx parser PTYPE enable information from the current package.
| |
| |   Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
| |   Tested-by: Tony Brelinski <tony.brelinski@intel.com>
| |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
* | net_tstamp: add new flag HWTSTAMP_FLAG_BONDED_PHC_INDEX [Hangbin Liu, 2021-12-14, 6 files, -24/+0]
| |
| |   Since commit 94dd016ae538 ("bond: pass get_ts_info and SIOC[SG]HWTSTAMP ioctl to active device") the user could get the bond active interface's PHC index directly. But when there is a failover, the bond active interface will change, and thus the PHC index also changes. This may break the user's program if they did not update the PHC in time.
| |
| |   This patch adds a new hwtstamp_config flag HWTSTAMP_FLAG_BONDED_PHC_INDEX. When the user wants to get the bond active interface's PHC, they need to add this flag and be aware that the PHC index may change.
| |
| |   With the new flag, all flag checks in current drivers are removed. Only the check in net_hwtstamp_validate() is kept.
| |
| |   Suggested-by: Jakub Kicinski <kuba@kernel.org>
| |   Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net [Jakub Kicinski, 2021-12-09, 14 files, -85/+165]
|\|
| |   No conflicts.
| |
| |   Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue [Jakub Kicinski, 2021-12-08, 9 files, -47/+74]
| |\
| | |   Tony Nguyen says:
| | |
| | |   ====================
| | |   Intel Wired LAN Driver Updates 2021-12-08
| | |
| | |   Yahui adds re-initialization of Flow Director for VF reset.
| | |
| | |   Paul restores interrupts when enabling VFs.
| | |
| | |   Dave re-adds bandwidth check for DCBNL and moves DSCP mode check earlier in the function.
| | |
| | |   Jesse prevents reporting of dropped packets that occur during initialization and fixes reporting of statistics which could occur with frequent reads.
| | |
| | |   Michal corrects setting of protocol type for UDP header and fixes lack of differentiation when adding filters for tunnels.
| | |
| | |   * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
| | |     ice: safer stats processing
| | |     ice: fix adding different tunnels
| | |     ice: fix choosing UDP header type
| | |     ice: ignore dropped packets during init
| | |     ice: Fix problems with DSCP QoS implementation
| | |     ice: rearm other interrupt cause register after enabling VFs
| | |     ice: fix FDIR init missing when reset VF
| | |   ====================
| | |
| | |   Link: https://lore.kernel.org/r/20211208211144.2629867-1-anthony.l.nguyen@intel.com
| | |   Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| | * ice: safer stats processing [Jesse Brandeburg, 2021-12-08, 1 file, -11/+18]
| | |
| | |   The driver was zeroing live stats that could be fetched by ndo_get_stats64 at any time. This could result in inconsistent statistics, and the telltale sign was when reading stats frequently from /proc/net/dev, the stats would go backwards.
| | |
| | |   Fix by collecting stats into a local, and delaying when we write to the structure so it's not incremental.
| | |
| | |   Fixes: fcea6f3da546 ("ice: Add stats and ethtool support")
| | |   Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
| | |   Tested-by: Gurucharan G <gurucharanx.g@intel.com>
| | |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
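The "collect into a local, then write once" pattern from the stats commit can be sketched in a few lines. The struct and function names are hypothetical, and the sketch is single-threaded: it shows only why the published totals never pass through zero, not the synchronization a real driver needs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-ring counters and the totals readers can see. */
struct ring_stats {
	uint64_t pkts;
	uint64_t bytes;
};

struct netdev_totals {
	uint64_t rx_packets;
	uint64_t rx_bytes;
};

/*
 * Sum into locals first and store the totals once at the end.
 * The live structure is never zeroed, so a reader cannot observe
 * a partially accumulated (smaller) value.
 */
static void update_totals(struct netdev_totals *live,
			  const struct ring_stats *rings, size_t n)
{
	uint64_t pkts = 0, bytes = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		pkts += rings[i].pkts;
		bytes += rings[i].bytes;
	}

	live->rx_packets = pkts;
	live->rx_bytes = bytes;
}
```

The buggy shape would zero `live` and then add to it ring by ring, exposing intermediate sums to any concurrent reader.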
| | * ice: fix adding different tunnels [Michal Swiatkowski, 2021-12-07, 6 files, -13/+25]
| | |
| | |   Adding filters with the same values inside for VXLAN and Geneve causes a HW error, because they look exactly the same. To choose between different types of tunnels, a new recipe is needed. Add storing of tunnel types in the recipe-creating function and start checking it in the finding function.
| | |
| | |   Change the function that gets open tunnels to return the port for the correct tunnel type. This is needed to copy the correct port to the dummy packet.
| | |
| | |   Block the user from adding enc_dst_port via tc flower, because VXLAN and Geneve filters can be created only with a destination port which was previously opened.
| | |
| | |   Fixes: 8b032a55c1bd5 ("ice: low level support for tunnels")
| | |   Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
| | |   Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
| | |   Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| | * ice: fix choosing UDP header typeMichal Swiatkowski2021-12-071-17/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In a tunnel packet there can be two UDP headers: - outer, which for hw should be marked as ICE_UDP_OF - inner, which for hw should be marked as ICE_UDP_ILOS, or as ICE_TCP_IL if the inner header is of TCP type In a non-tunnel packet the header can be: - UDP, which for hw should be marked as ICE_UDP_ILOS - TCP, which for hw should be marked as ICE_TCP_IL Change the incorrect ICE_UDP_OF for non-tunnel packets to ICE_UDP_ILOS. ICE_UDP_OF is incorrect for non-tunnel packets and setting it leads to an error from hw while adding this kind of recipe. In summary, for a tunnel the outer port type should always be set to ICE_UDP_OF; for non-tunnel outer and tunnel inner it should always be set to ICE_UDP_ILOS. Fixes: 9e300987d4a8 ("ice: VXLAN and Geneve TC support") Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
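The selection rule summarized in the commit above can be written as a small decision function. This is only a sketch: the enum values mirror the names ICE_UDP_OF, ICE_UDP_ILOS and ICE_TCP_IL from the commit message, but the SKETCH_ prefixed types and the function itself are hypothetical and are not the driver's API.

```c
#include <stdbool.h>

/* Hypothetical mirror of the three header types named in the commit. */
enum sketch_hdr_type { SKETCH_UDP_OF, SKETCH_UDP_ILOS, SKETCH_TCP_IL };
enum sketch_l4_proto { SKETCH_L4_UDP, SKETCH_L4_TCP };

/*
 * Pick the hardware header type for an L4 header.
 *  - outer header of a tunnel packet: always the "UDP_OF" type
 *  - tunnel inner header, or any non-tunnel header: "UDP_ILOS" for UDP,
 *    "TCP_IL" for TCP
 */
enum sketch_hdr_type sketch_pick_l4_hdr_type(bool is_tunnel, bool is_outer,
                                             enum sketch_l4_proto proto)
{
    if (is_tunnel && is_outer)
        return SKETCH_UDP_OF;

    return proto == SKETCH_L4_TCP ? SKETCH_TCP_IL : SKETCH_UDP_ILOS;
}
```

For example, a plain (non-tunnel) UDP header maps to the ILOS type even though it is the outermost header, which is the case the commit corrects.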
| | * ice: ignore dropped packets during initJesse Brandeburg2021-12-071-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the hardware is constantly receiving unicast or broadcast packets during driver load, the device previously counted many GLV_RDPC (VSI dropped packets) events during init. This causes confusing dropped packet statistics during driver load. The dropped packets counter incrementing does stop once the driver finishes loading. Avoid this problem by baselining our statistics at the end of driver open instead of the end of probe. Fixes: cdedef59deb0 ("ice: Configure VSIs for Tx/Rx") Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| | * ice: Fix problems with DSCP QoS implementationDave Ertman2021-12-071-6/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The patch that implemented DSCP QoS removed a bandwidth check that was used to check for a specific condition caused by some corner cases. This check should not have been removed. The same patch also added a check for when the DCBx state could be changed in relation to DSCP, but the check was erroneously added nested in a check for CEE mode, which made the check useless. Fix these problems by re-adding the bandwidth check and relocating the DSCP mode check earlier in the function that changes DCBx state in the driver. Fixes: 2a87bd73e50d ("ice: Add DSCP support") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| | * ice: rearm other interrupt cause register after enabling VFsPaul Greenwalt2021-12-071-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The other interrupt cause register (OICR), global interrupt 0, is disabled when enabling VFs to prevent handling VFLR. If the OICR is not rearmed then the VF cannot communicate with the PF. Rearm the OICR after enabling VFs. Fixes: 916c7fdf5e93 ("ice: Separate VF VSI initialization/creation from reset flow") Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| | * ice: fix FDIR init missing when reset VFYahui Cao2021-12-071-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When VF is being reset, ice_reset_vf() will be called and FDIR resource should be released and initialized again. Fixes: 1f7ea1cd6a37 ("ice: Enable FDIR Configure for AVF") Signed-off-by: Yahui Cao <yahui.cao@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * | i40e: Fix NULL pointer dereference in i40e_dbg_dump_descNorbert Zulinski2021-12-061-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When trying to dump VFs VSI RX/TX descriptors using debugfs there was a crash due to NULL pointer dereference in i40e_dbg_dump_desc. Added a check to i40e_dbg_dump_desc that checks if VSI type is correct for dumping RX/TX descriptors. Fixes: 02e9c290814c ("i40e: debugfs interface") Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Signed-off-by: Norbert Zulinski <norbertx.zulinski@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * | i40e: Fix pre-set max number of queues for VFMateusz Palczewski2021-12-061-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After setting pre-set combined to 16 queues and reserving 16 queues by tc qdisc, pre-set maximum combined queues returned to default value after VF reset being 4 and this generated errors during removing tc. Fixed by removing clear num_req_queues before reset VF. Fixes: e284fc280473 (i40e: Add and delete cloud filter) Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Bindushree P <Bindushree.p@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * | i40e: Fix failed opcode appearing if handling messages from VFKaren Sornek2021-12-062-22/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the failed operation code that appears when handling messages from a VF. This is implemented by waiting for the VF to reach the appropriate state if a request starts being handled while the VF is resetting. Without this patch, a message handling request made while the VF is in a reset state ends with error -5 (I40E_ERR_PARAM). Fixes: 5c3c48ac6bf5 ("i40e: implement virtual device interface") Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com> Signed-off-by: Karen Sornek <karen.sornek@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * | iavf: Fix reporting when setting descriptor countMichal Maloszewski2021-12-061-11/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | iavf_set_ringparams doesn't communicate to the user that 1. The user requested descriptor count is out of range. Instead it just quietly sets descriptors to the "clamped" value and calls it done. This makes it look like an invalid value was successfully set as the descriptor count when this isn't actually true. 2. The user provided descriptor count needs to be inflated for alignment reasons. This behavior is confusing. The ice driver has already addressed this by rejecting invalid values for descriptor count and messaging for alignment adjustments. Do the same thing here by adding the error and info messages. Fixes: fbb7ddfef253 ("i40evf: core ethtool functionality") Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Michal Maloszewski <michal.maloszewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
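The two behaviors the commit above asks for (reject out-of-range counts, report alignment adjustments) can be sketched as below. The limits and the alignment multiple are assumed values chosen for illustration, and the function name is hypothetical; the commit message does not state the driver's actual constants.

```c
#include <errno.h>
#include <stdbool.h>

/* Assumed limits, for illustration only. */
#define SKETCH_MIN_DESC      64u
#define SKETCH_MAX_DESC      4096u
#define SKETCH_DESC_MULTIPLE 32u

/*
 * Validate a user-requested descriptor count.
 * Returns -EINVAL when the request is out of range instead of silently
 * clamping it. Otherwise stores the count rounded up to the alignment
 * multiple in *out and sets *adjusted so the caller can log an info
 * message when the value was changed.
 */
int sketch_check_desc_count(unsigned int requested, unsigned int *out,
                            bool *adjusted)
{
    unsigned int aligned;

    if (requested < SKETCH_MIN_DESC || requested > SKETCH_MAX_DESC)
        return -EINVAL;

    /* Round up to the next multiple of SKETCH_DESC_MULTIPLE. */
    aligned = (requested + SKETCH_DESC_MULTIPLE - 1) /
              SKETCH_DESC_MULTIPLE * SKETCH_DESC_MULTIPLE;

    *adjusted = (aligned != requested);
    *out = aligned;
    return 0;
}
```

With these assumed constants, a request for 100 descriptors succeeds with 128 and the adjusted flag set, while a request for 8 fails outright rather than being clamped to the minimum.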
| * | iavf: restore MSI state on resetMitch Williams2021-12-011-0/+1
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the PF experiences an FLR, the VF's MSI and MSI-X configuration will be conveniently and silently removed in the process. When this happens, reset recovery will appear to complete normally but no traffic will pass. The netdev watchdog will helpfully notify everyone of this issue. To prevent such public embarrassment, restore MSI configuration at every reset. For normal resets, this will do no harm, but for VF resets resulting from a PF FLR, this will keep the VF working. Fixes: 5eae00c57f5e ("i40evf: main driver core") Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2021-12-021-0/+1
|\| | | | | | | Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * ice: xsk: clear status_error0 for each allocated descMaciej Fijalkowski2021-11-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix a bug in which the receiving of packets can stop in the zero-copy driver. Ice HW ignores 3 lower bits from QRX_TAIL register, which means that tail is bumped only on intervals of 8. Currently with XSK RX batching in place, ice_alloc_rx_bufs_zc() clears the status_error0 only of the last descriptor that has been allocated/taken from the XSK buffer pool. status_error0 includes the DD bit that is checked by ice_clean_rx_irq_zc() to tell if a descriptor can be processed. The bug can be triggered when the driver updates the ntu but not the QRX_TAIL, so HW wouldn't have a chance to write to the ready descriptors. Later on the driver moves the ntc to the mentioned set of descriptors and interprets them as ready to be processed, since the corresponding DD bits were not cleared nor has any writeback happened that would clear them. This can then lead to the ntc == ntu case, which means that the ring is empty and no further packets are processed. Fix the XSK traffic hang that can be observed when the l2fwd scenario from xdpsock is used by making sure that status_error0 is cleared for each descriptor that is fed to HW, so that the driver will not process invalid DD bits. This will also prevent the driver from processing the descriptors that were allocated in favor of the previously processed ones, but writeback didn't happen yet. Fixes: db804cfc21e9 ("ice: Use the xsk batched rx allocation interface") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
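The two mechanisms in the commit above, the tail register dropping its 3 lower bits and the per-descriptor clearing of status_error0, can be illustrated with a simplified model. Everything here is a hypothetical stand-in (descriptor layout, DD bit position, function names), intended only to show why every allocated descriptor must be cleared and not just the last one.

```c
#include <stdint.h>

/* Hypothetical, simplified Rx descriptor; bit 0 stands in for DD. */
struct sketch_rx_desc { uint16_t status_error0; };
#define SKETCH_DD_BIT 0x1u

/*
 * Model of the tail value the hardware actually sees: the 3 lower bits
 * are ignored, so the tail only advances in steps of 8.
 */
uint16_t sketch_hw_visible_tail(uint16_t ntu)
{
    return (uint16_t)(ntu & ~7u);
}

/*
 * Clear status_error0 for each descriptor handed to hardware, starting
 * at next-to-use and wrapping at the end of the ring. A stale DD bit
 * left in any of them could later be mistaken for a completed packet.
 */
void sketch_alloc_rx_descs(struct sketch_rx_desc *ring, uint16_t ring_size,
                           uint16_t ntu, uint16_t count)
{
    for (uint16_t i = 0; i < count; i++)
        ring[(ntu + i) % ring_size].status_error0 = 0;
}
```

In this model, advancing next-to-use from 8 to 13 leaves the hardware-visible tail at 8, so descriptors 8 through 12 are not yet owned by hardware; if any of them still carried an old DD bit, the cleaning path would treat it as ready.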
* | Merge branch '40GbE' of ↵David S. Miller2021-12-015-44/+69
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== 40GbE Intel Wired LAN Driver Updates 2021-11-30 This series contains updates to iavf driver only. Patryk adds a debug message when MTU is changed. Grzegorz adds messaging when transitioning in and out of multicast promiscuous mode. Jake returns correct error codes for iavf_parse_cls_flower(). Jedrzej adds messaging for when the driver is removed and refactors struct usage to take less memory. He also adjusts ethtool statistics to only display information on active queues. Tony allows for user to specify the RSS hash. Karen resolves some static analysis warnings, corrects format specifiers, and rewords a message to come across as informational. v2: - Dropped patch 1 (for net) and 5 - Change MTU message from info to debug ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | iavf: Fix displaying queue statistics shown by ethtoolJedrzej Jagielski2021-11-301-11/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Driver provided too many lines as an output to ethtool -S command. Return actual length of string set of ethtool stats. Instead of predefined maximal value use the actual value on netdev, iterate over active queues. Without this patch, ethtool -S report would produce additional erroneous lines of queues that are not configured. Signed-off-by: Witold Fijalkowski <witoldx.fijalkowski@intel.com> Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>