diff options
author | David S. Miller <davem@davemloft.net> | 2023-09-16 12:00:56 +0100 |
---|---|---|
committer | David S. Miller <davem@davemloft.net> | 2023-09-16 12:00:56 +0100 |
commit | b6a7eeb44a6a8b1ea2dea78b5de448d86eb0bd96 (patch) | |
tree | 33c71b39157a14fc690043bf9a4381f8d51f8abd /Documentation/networking | |
parent | 2fa6175d8b2088345caae07151fa87b94e0986e8 (diff) | |
parent | a251eee62133774cf35ff829041377e721ef9c8c (diff) | |
download | linux-stable-b6a7eeb44a6a8b1ea2dea78b5de448d86eb0bd96.tar.gz linux-stable-b6a7eeb44a6a8b1ea2dea78b5de448d86eb0bd96.tar.bz2 linux-stable-b6a7eeb44a6a8b1ea2dea78b5de448d86eb0bd96.zip |
Merge branch '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
Introduce Intel IDPF driver
Pavan Kumar Linga says:
This patch series introduces the Intel Infrastructure Data Path Function
(IDPF) driver. It is used for both physical and virtual functions. Except
for some of the device operations the rest of the functionality is the
same for both PF and VF. IDPF uses virtchnl version2 opcodes and
structures defined in the virtchnl2 header file which helps the driver
to learn the capabilities and register offsets from the device
Control Plane (CP) instead of assuming the default values.
The format of the series follows the driver init flow to interface open.
To start with, probe gets called and kicks off the driver initialization
by spawning the 'vc_event_task' work queue which in turn calls the
'hard reset' function. As part of that, the mailbox is initialized which
is used to send/receive the virtchnl messages to/from the CP. Once that is
done, 'core init' kicks in which requests all the required global resources
from the CP and spawns the 'init_task' work queue to create the vports.
Based on the capability information received, the driver creates the said
number of vports (one or many) where each vport is associated to a netdev.
Also, each vport has its own resources such as queues, vectors etc.
From there, rest of the netdev_ops and data path are added.
IDPF implements both single queue which is traditional queueing model
as well as split queue model. In split queue model, it uses separate queue
for both completion descriptors and buffers which helps to implement
out-of-order completions. It also helps to implement asymmetric queues,
for example multiple RX completion queues can be processed by a single
RX buffer queue and multiple TX buffer queues can be processed by a
single TX completion queue. In single queue model, same queue is used
for both descriptor completions as well as buffer completions. It also
supports features such as generic checksum offload, generic receive
offload (hardware GRO) etc.
---
v7:
Patch 2:
* removed pci_[disable|enable]_pcie_error_reporting as they are dropped
from the core
Patch 4, 9:
* used 'kasprintf' instead of 'snprintf' to avoid providing explicit
character string size which also fixes "-Wformat-truncation" warnings
Patch 14:
* used 'ethtool_sprintf' instead of 'snprintf' to avoid providing explicit
character string size which also fixes "-Wformat-truncation" warning
* add string format argument to the 'ethtool_sprintf' to avoid warning on
"-Wformat-security"
v6: https://lore.kernel.org/netdev/20230825235954.894050-1-pavan.kumar.linga@intel.com/
Note: 'Acked-by' was only added to patches 1, 2, 12 and not to the other
patches because of the changes in v6
Patch 3, 4, 5, 6, 7, 8, 9, 11, 13, 14, 15:
* renamed 'reset_lock' to 'vport_ctrl_lock' to reflect the lock usage
* to avoid defensive programming, used 'vport_ctrl_lock' for the user
callbacks that access the 'vport' to prevent the hardware reset thread
from releasing the 'vport', when the user callback is in progress
* added some variables to netdev private structure to avoid vport access
if possible from ethtool and ndo callbacks
* moved 'mac_filter_list_lock' and MAC related flags to vport_config
structure and refactored mac filter flow to handle asynchronous
ndo mac filter callbacks
* stop the queues before starting the reset flow to avoid TX hangs
* removed 'sw_mutex' and 'stop_mutex' as they are not needed anymore
* added missing clear bit in 'init_task' error path
* renamed labels appropriately
Patch 8:
* replaced page_pool_put_page with page_pool_put_full_page
* for the page pool max_len, used PAGE_SIZE
Patch 10, 11, 13:
* made use of the 'netif_txq_maybe_stop', '__netif_txq_completed_wake'
helper macros
Patch 13:
* removed IDPF_HR_RESET_IN_PROG flag check in idpf_tx_singleq_start
as it is defensive
Patch 14:
* removed max descriptor check as the core does that
* removed unnecessary error messages
* removed the stats that are common between the ones reported by ethtool
and ip link
* replaced snprintf with ethtool_sprintf
* added a comment to explain the reason for the max queue check
* as the netdev queues are set on alloc, there is no need to set
them again on reset unless there is a queue change, so move the
'idpf_set_real_num_queues' to 'idpf_initiate_soft_reset'
Patch 15:
* reworded the 'configure SRIOV' in the commit message
v5: https://lore.kernel.org/netdev/20230816004305.216136-1-anthony.l.nguyen@intel.com/
Most Patches:
* wrapped line limit to 80 chars to those which don't effect readability
Patch 12:
* in skb_add_rx_frag, offset 'headlen' w.r.t page_offset when adding a
frag to avoid adding the header again
Patch 14:
* added NULL check for 'rxq' when dereferencing it in page_pool_get_stats
v4: https://lore.kernel.org/netdev/20230808003416.3805142-1-anthony.l.nguyen@intel.com/
Patch 1:
* s/virtcnl/virtchnl
* removed the kernel doc for the error code definitions that don't exist
* reworded the summary part in the virtchnl2 header
Patch 3:
* don't set local variable to NULL on error
* renamed sq_send_command_out label with err_unlock
* don't use __GFP_ZERO in dma_alloc_coherent
Patch 4:
* introduced mailbox workqueue to process mailbox interrupts
Patch 3, 4, 5, 6, 7, 8, 9, 11, 15:
* removed unnecessary variable 0-init
Patch 3, 5, 7, 8, 9, 15:
* removed defensive programming checks wherever applicable
* removed IDPF_CAP_FIELD_LAST as it can be treated as defensive
programming
Patch 3, 4, 5, 6, 7:
* replaced IDPF_DFLT_MBX_BUF_SIZE with IDPF_CTLQ_MAX_BUF_LEN
Patch 2 to 15:
* add kernel-doc for idpf.h and idpf_txrx.h enums and structures
Patch 4, 5, 15:
* adjusted the destroy sequence of the workqueues as per the alloc
sequence
Patch 4, 5, 9, 15:
* scrub unnecessary flags in 'idpf_flags'
- IDPF_REMOVE_IN_PROG flag can take care of the cases where
IDPF_REL_RES_IN_PROG is used, removed the later one
- IDPF_REQ_[TX|RX]_SPLITQ are replaced with struct variables
- IDPF_CANCEL_[SERVICE|STATS]_TASK are redundant as the work queue
doesn't get rescheduled again after 'cancel_delayed_work_sync'
- IDPF_HR_CORE_RESET is removed as there is no set_bit for this flag
- IDPF_MB_INTR_TRIGGER is removed as it is not needed anymore with the
mailbox workqueue implementation
Patch 7 to 15:
* replaced the custom buffer recycling code with page pool API
* switched the header split buffer allocations from using a bunch of
pages to using one large chunk of DMA memory
* reordered some of the flows in vport_open to support page pool
Patch 8, 12:
* don't suppress the alloc errors by using __GFP_NOWARN
Patch 9:
* removed dyn_ctl_clrpba_m as it is not being used
Patch 14:
* introduced enum idpf_vport_reset_cause instead of using vport flags
* introduced page pool stats
v3: https://lore.kernel.org/netdev/20230616231341.2885622-1-anthony.l.nguyen@intel.com/
Patch 5:
* instead of void, used 'struct virtchnl2_create_vport' type for
vport_params_recvd and vport_params_reqd and removed the typecasting
* used u16/u32 as needed instead of int for variables which cannot be
negative and updated in all the places whereever applicable
Patch 6:
* changed the commit message to "add ptypes and MAC filter support"
* used the sender Signed-off-by as the last tag on all the patches
* removed unnecessary variables 0-init
* instead of fixing the code in this commit, fixed it in the commit
where the change was introduced first
* moved get_type_info struct on to the stack instead of memory alloc
* moved mutex_lock and ptype_info memory alloc outside while loop and
adjusted the return flow
* used 'break' instead of 'continue' in ptype id switch case
v2: https://lore.kernel.org/netdev/20230614171428.1504179-1-anthony.l.nguyen@intel.com/
Patch 2:
* added "Intel(R)" to the DRV_SUMMARY and Makefile.
Patch 4, 5, 6, 15:
* replaced IDPF_VC_MSG_PENDING flag with mutex 'vc_buf_lock' for the
adapter related virtchnl opcodes.
* get the mutex lock in the virtchnl send thread itself instead of
in receive thread.
Patch 5, 6, 7, 8, 9, 11, 14, 15:
* replaced IDPF_VPORT_VC_MSG_PENDING flag with mutex 'vc_buf_lock' for
the vport related virtchnl opcodes.
* get the mutex lock in the virtchnl send thread itself instead of
in receive thread.
Patch 6:
* converted get_ptype_info logic from 1:N to 1:1 message exchange for
better handling of mutex lock.
Patch 15:
* introduced 'stats_lock' spinlock to avoid concurrent stats update.
v1: https://lore.kernel.org/netdev/20230530234501.2680230-1-anthony.l.nguyen@intel.com/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to 'Documentation/networking')
-rw-r--r-- | Documentation/networking/device_drivers/ethernet/index.rst | 1 | ||||
-rw-r--r-- | Documentation/networking/device_drivers/ethernet/intel/idpf.rst | 160 |
2 files changed, 161 insertions, 0 deletions
diff --git a/Documentation/networking/device_drivers/ethernet/index.rst b/Documentation/networking/device_drivers/ethernet/index.rst index 9827e816084b..43de285b8a92 100644 --- a/Documentation/networking/device_drivers/ethernet/index.rst +++ b/Documentation/networking/device_drivers/ethernet/index.rst @@ -32,6 +32,7 @@ Contents: intel/e1000 intel/e1000e intel/fm10k + intel/idpf intel/igb intel/igbvf intel/ixgbe diff --git a/Documentation/networking/device_drivers/ethernet/intel/idpf.rst b/Documentation/networking/device_drivers/ethernet/intel/idpf.rst new file mode 100644 index 000000000000..adb16e2abd21 --- /dev/null +++ b/Documentation/networking/device_drivers/ethernet/intel/idpf.rst @@ -0,0 +1,160 @@ +.. SPDX-License-Identifier: GPL-2.0+ + +========================================================================== +idpf Linux* Base Driver for the Intel(R) Infrastructure Data Path Function +========================================================================== + +Intel idpf Linux driver. +Copyright(C) 2023 Intel Corporation. + +.. contents:: + +The idpf driver serves as both the Physical Function (PF) and Virtual Function +(VF) driver for the Intel(R) Infrastructure Data Path Function. + +Driver information can be obtained using ethtool, lspci, and ip. + +For questions related to hardware requirements, refer to the documentation +supplied with your Intel adapter. All hardware requirements listed apply to use +with Linux. + + +Identifying Your Adapter +======================== +For information on how to identify your adapter, and for the latest Intel +network drivers, refer to the Intel Support website: +http://www.intel.com/support + + +Additional Features and Configurations +====================================== + +ethtool +------- +The driver utilizes the ethtool interface for driver configuration and +diagnostics, as well as displaying statistical information. The latest ethtool +version is required for this functionality. If you don't have one yet, you can +obtain it at: +https://kernel.org/pub/software/network/ethtool/ + + +Viewing Link Messages +--------------------- +Link messages will not be displayed to the console if the distribution is +restricting system messages. In order to see network driver link messages on +your console, set dmesg to eight by entering the following:: + + # dmesg -n 8 + +.. note:: + This setting is not saved across reboots. + + +Jumbo Frames +------------ +Jumbo Frames support is enabled by changing the Maximum Transmission Unit (MTU) +to a value larger than the default value of 1500. + +Use the ip command to increase the MTU size. For example, enter the following +where <ethX> is the interface number:: + + # ip link set mtu 9000 dev <ethX> + # ip link set up dev <ethX> + +.. note:: + The maximum MTU setting for jumbo frames is 9706. This corresponds to the + maximum jumbo frame size of 9728 bytes. + +.. note:: + This driver will attempt to use multiple page sized buffers to receive + each jumbo packet. This should help to avoid buffer starvation issues when + allocating receive packets. + +.. note:: + Packet loss may have a greater impact on throughput when you use jumbo + frames. If you observe a drop in performance after enabling jumbo frames, + enabling flow control may mitigate the issue. + + +Performance Optimization +======================== +Driver defaults are meant to fit a wide variety of workloads, but if further +optimization is required, we recommend experimenting with the following +settings. + + +Interrupt Rate Limiting +----------------------- +This driver supports an adaptive interrupt throttle rate (ITR) mechanism that +is tuned for general workloads. The user can customize the interrupt rate +control for specific workloads, via ethtool, adjusting the number of +microseconds between interrupts. + +To set the interrupt rate manually, you must disable adaptive mode:: + + # ethtool -C <ethX> adaptive-rx off adaptive-tx off + +For lower CPU utilization: + - Disable adaptive ITR and lower Rx and Tx interrupts. The examples below + affect every queue of the specified interface. + + - Setting rx-usecs and tx-usecs to 80 will limit interrupts to about + 12,500 interrupts per second per queue:: + + # ethtool -C <ethX> adaptive-rx off adaptive-tx off rx-usecs 80 + tx-usecs 80 + +For reduced latency: + - Disable adaptive ITR and ITR by setting rx-usecs and tx-usecs to 0 + using ethtool:: + + # ethtool -C <ethX> adaptive-rx off adaptive-tx off rx-usecs 0 + tx-usecs 0 + +Per-queue interrupt rate settings: + - The following examples are for queues 1 and 3, but you can adjust other + queues. + + - To disable Rx adaptive ITR and set static Rx ITR to 10 microseconds or + about 100,000 interrupts/second, for queues 1 and 3:: + + # ethtool --per-queue <ethX> queue_mask 0xa --coalesce adaptive-rx off + rx-usecs 10 + + - To show the current coalesce settings for queues 1 and 3:: + + # ethtool --per-queue <ethX> queue_mask 0xa --show-coalesce + + + +Virtualized Environments +------------------------ +In addition to the other suggestions in this section, the following may be +helpful to optimize performance in VMs. + + - Using the appropriate mechanism (vcpupin) in the VM, pin the CPUs to + individual LCPUs, making sure to use a set of CPUs included in the + device's local_cpulist: /sys/class/net/<ethX>/device/local_cpulist. + + - Configure as many Rx/Tx queues in the VM as available. (See the idpf driver + documentation for the number of queues supported.) For example:: + + # ethtool -L <virt_interface> rx <max> tx <max> + + +Support +======= +For general information, go to the Intel support website at: +http://www.intel.com/support/ + +If an issue is identified with the released source code on a supported kernel +with a supported adapter, email the specific information related to the issue +to intel-wired-lan@lists.osuosl.org. + + +Trademarks +========== +Intel is a trademark or registered trademark of Intel Corporation or its +subsidiaries in the United States and/or other countries. + +* Other names and brands may be claimed as the property of others. |