linux-stable.git - Linux kernel stable tree

	Commit message (Collapse)	Author	Age	Files	Lines
*	can: netns: remove "can_" prefix from members struct netns_can	Marc Kleine-Budde	2019-09-04	3	-27/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch improves the code reability by removing the redundant "can_" prefix from the members of struct netns_can (as the struct netns_can itself is the member "can" of the struct net.) The conversion is done with: sed -i \ -e "s/struct can_dev_rcv_lists \can_rx_alldev_list;/struct can_dev_rcv_lists rx_alldev_list;/" \ -e "s/spinlock_t can_rcvlists_lock;/spinlock_t rcvlists_lock;/" \ -e "s/struct timer_list can_stattimer;/struct timer_list stattimer; /" \ -e "s/can\.can_rx_alldev_list/can.rx_alldev_list/g" \ -e "s/can\.can_rcvlists_lock/can.rcvlists_lock/g" \ -e "s/can\.can_stattimer/can.stattimer/g" \ include/net/netns/can.h \ net/can/*.[ch] Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
*	can: proc: give variables holding CAN statistics a sensible name	Marc Kleine-Budde	2019-09-04	1	-58/+58
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch rename the variables holding the CAN statistics (can_stats and can_pstats) to pkg_stats and rcv_lists_stats which reflect better their meaning. The conversion is done with: sed -i \ -e "s/can_stats$[^_]$/pkg_stats\1/g" \ -e "s/can_pstats/rcv_lists_stats/g" \ net/can/proc.c Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
*	can: af_can: give variables holding CAN statistics a sensible name	Marc Kleine-Budde	2019-09-04	1	-15/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch rename the variables holding the CAN statistics (can_stats and can_pstats) to pkg_stats and rcv_lists_stats which reflect better their meaning. The conversion is done with: sed -i \ -e "s/can_stats$[^_]$/pkg_stats\1/g" \ -e "s/can_pstats/rcv_lists_stats/g" \ net/can/af_can.c Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
*	can: netns: give members of struct netns_can holding the statistics a ↵	Marc Kleine-Budde	2019-09-04	3	-25/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	sensible name This patch gives the members of the struct netns_can that are holding the statistics a sensible name, by renaming struct netns_can::can_stats into struct netns_can::pkg_stats and struct netns_can::can_pstats into struct netns_can::rcv_lists_stats. The conversion is done with: sed -i \ -e "s:$struct[^]\$can_stats;.:\1pkg_stats;:" \ -e "s:$struct[^]\$can_pstats;.:\1rcv_lists_stats;:" \ -e "s/can\.can_stats/can.pkg_stats/g" \ -e "s/can\.can_pstats/can.rcv_lists_stats/g" \ net/can/*.[ch] \ include/net/netns/can.h Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
*	can: netns: give structs holding the CAN statistics a sensible name	Marc Kleine-Budde	2019-09-04	4	-18/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch renames both "struct s_stats" and "struct s_pstats", to "struct can_pkg_stats" and "struct can_rcv_lists_stats" to better reflect their meaning and improve code readability. The conversion is done with: sed -i \ -e "s/struct s_stats/struct can_pkg_stats/g" \ -e "s/struct s_pstats/struct can_rcv_lists_stats/g" \ net/can/*.[ch] \ include/net/netns/can.h Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
*	Merge branch '100GbE' of ↵	David S. Miller	2019-09-03	9	-74/+144
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 100GbE Intel Wired LAN Driver Updates 2019-09-03 This series contains updates to ice driver only. Anirudh adds the ability for the driver to handle EMP resets correctly by adding the logic to the existing ice_reset_subtask(). Jeb fixes up the logic to properly free up the resources for a switch rule whether or not it was successful in the removal. Brett fixes up the reporting of ITR values to let the user know odd ITR values are not allowed. Fixes the driver to only disable VLAN pruning on VLAN deletion when the VLAN being deleted is the last VLAN on the VF VSI. Chinh updates the driver to determine the TSA value from the priority value when in CEE mode. Bruce aligns the driver with the hardware specification by ensuring that a PF reset is done as part of the unload logic. Also update the driver unloading field, based on the latest hardware specification, which allows us to remove an unnecessary endian conversion. Moves #defines based on their need in the code. Jesse adds the current state of auto-negotiation in the link up message. In addition, adds additional information to inform the user of an issue with the topology/configuration of the link. Usha updates the driver to allow the maximum TCs that the firmware supports, rather than hard coding to a set value. Dave updates the DCB initialization flow to handle the case of an actual error during DCB init. Updated the driver to report the current stats, even when the netdev is down, which aligns with our other drivers. Mitch fixes the VF reset code flows to ensure that it properly calls ice_dis_vsi_txq() to notify the firmware that the VF is being reset. Michal fixes the driver so the DCB is not enabled when the SW LLDP is activated, which was causing a communication issue with other NICs. The problem lies in that DCB was being enabled without checking the number of TCs. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	ice: Only disable VLAN pruning for the VF when all VLANs are removed	Brett Creeley	2019-09-03	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently if the VF adds a VLAN, VLAN pruning will be enabled for that VSI. Also, when a VLAN gets deleted it will disable VLAN pruning even if other VLAN(s) exists for the VF. Fix this by only disabling VLAN pruning on the VF VSI when removing the last VF (i.e. vf->num_vlan == 0). Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ice: Remove enable DCB when SW LLDP is activated	Michal Swiatkowski	2019-09-03	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove code that enables DCB in initialization when SW LLDP is activated. DCB flag is set or reset before in ice_init_pf_dcb based on number of TCs. So there is not need to overwrite it. Setting DCB without checking number of TCs can cause communication problems with other cards. Host card sends packet with VLAN priority tag, but client card doesn't strip this tag and ping doesn't work. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ice: Report stats when VSI is down	Dave Ertman	2019-09-03	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is currently a check in get_ndo_stats that returns before updating stats if the VSI is down or there are no Tx or Rx queues. This causes the netdev to report zero stats with the netdev is down. Remove the check so that the behavior of reporting stats is the same as it was in IXGBE. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ice: Always notify FW of VF reset	Mitch Williams	2019-09-03	1	-9/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The call to ice_dis_vsi_txq() acts as the notification to the firmware that the VF is being reset. Because of this, we need to make this call every time we reset, regardless of whatever else we do to stop the Tx queues. Without this change, VF resets would fail to complete on interfaces that were up and running. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ice: Correctly handle return values for init DCB	Dave Ertman	2019-09-03	1	-9/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the init path for DCB, the call to ice_init_dcb() can return a non-zero value for either an actual error, or due to the FW lldp engine being stopped. We are currently treating all non-zero values only as an indication that the FW LLDP engine is stopped. Check for an actual error in the DCB init flow. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ice: Limit Max TCs on devices with more than 4 ports	Usha Ketineni	2019-09-03	4	-2/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch limits the max TCs set by the driver to the value provided by the firmware as per the capabilities of the device. Otherwise, hard coding to 8 TC max would fail the device configurations with more than 4 ports. Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ice: Cleanup defines in ice_type.h	Tony Nguyen	2019-09-03	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Conventionally, if the #defines/other are not needed by other header files being included, #includes are done first followed by #defines and other stuff. Move the #defines before the #includes to follow this convention. Suggested by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ice: print extra message if topology issue	Jesse Brandeburg	2019-09-03	1	-0/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The driver needs to inform the user if there is an issue with the topology / configuration of the link. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ice: add print of autoneg state to link message	Jesse Brandeburg	2019-09-03	1	-2/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Print the state of auto-negotiation when printing the Link up message. Adds new text to the "NIC Link is up" line like Autoneg: <True \| False> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ice: update driver unloading field for Queue Shutdown AQ command	Bruce Allan	2019-09-03	2	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	According to recent specification versions, the field in the Queue Shutdown AdminQ command consisting of the "driver unloading" indication is not a 4 byte field (it is byte.bit 16.0). Change it to a byte and remove the unnecessary endian conversion. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ice: add needed PFR during driver unload	Bruce Allan	2019-09-03	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	According to the specification, a PF Reset must be done as part of the driver unload flow. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ice: Deduce TSA value from the priority value in the CEE mode	Chinh T Cao	2019-09-03	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In CEE mode, the TSA information can be derived from the reported priority value. Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ice: Report what the user set for coalesce [tx\|rx]-usecs	Brett Creeley	2019-09-03	1	-44/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently if the user sets an odd value for [tx\|rx]-usecs we align the value because the hardware only understands ITR values in multiples of 2. This seems misleading because we are essentially telling the user that the ITR value is odd, when in fact we have changed it internally. Fix this by reporting that setting odd ITR values is not allowed. Also, while making changes to ice_set_rc_coalesce() I noticed a bit of code/error duplication. Make the necessary changes to remove the duplication. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ice: Fix resource leak in ice_remove_rule_internal()	Jeb Cramer	2019-09-03	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We don't free s_rule if ice_aq_sw_rules() returns a non-zero status. If it returned a zero status, s_rule would be freed right after, so this implies it should be freed within the scope of the function regardless. Signed-off-by: Jeb Cramer <jeb.j.cramer@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ice: Fix EMP reset handling	Anirudh Venkataramanan	2019-09-03	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ice_reset_subtask needs to handle EMP resets as well, as EMP resets can be triggered by the firmware. This patch adds the logic to do this. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* \|	Merge tag 'mlx5-updates-2019-09-01-v2' of ↵	David S. Miller	2019-09-03	44	-371/+12464
\|\ \ \| \|/ \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2019-09-01 (Software steering support) Abstract: -------- Mellanox ConnetX devices supports packet matching, packet modification and redirection. These functionalities are also referred to as flow-steering. To configure a steering rule, the rule is written to the device owned memory, this memory is accessed and cached by the device when processing a packet. Steering rules are constructed from multiple steering entries (STE). Rules are configured using the Firmware command interface. The Firmware processes the given driver command and translates them to STEs, then writes them to the device memory in the current steering tables. This process is slow due to the architecture of the command interface and the processing complexity of each rule. The highlight of this patchset is to cut the middle man (The firmware) and do steering rules programming into device directly from the driver, with no firmware intervention whatsoever. Motivation: ----------- Software (driver managed) steering allows for high rule insertion rates compared to the FW steering described above, this is achieved by using internal RDMA writes to the device owned memory instead of the slow command interface to program steering rules. Software (driver managed) steering, doesn't depend on new FW for new steering functionality, new implementations can be done in the driver skipping the FW layer. Performance: ------------ The insertion rate on a single core using the new approach allows programming ~300K rules per sec. (Done via direct raw test to the new mlx5 sw steering layer, without any kernel layer involved). Test: TC L2 rules 33K/s with Software steering (this patchset). 5K/s with FW and current driver. This will improve OVS based solution performance. Architecture and implementation details: ---------------------------------------- Software steering will be dynamically selected via devlink device parameter. Example: $ devlink dev param show pci/0000:06:00.0 name flow_steering_mode pci/0000:06:00.0: name flow_steering_mode type driver-specific values: cmode runtime value smfs mlx5 software steering module a.k.a (DR - Direct Rule) is implemented and contained in mlx5/core/steering directory and controlled by MLX5_SW_STEERING kconfig flag. mlx5 core steering layer (fs_core) already provides a shim layer for implementing different steering mechanisms, software steering will leverage that as seen at the end of this series. When Software Steering for a specific steering domain (NIC/RDMA/Vport/ESwitch, etc ..) is supported, it will cause rules targeting this domain to be created using SW steering instead of FW. The implementation includes: Domain - The steering domain is the object that all other object resides in. It holds the memory allocator, send engine, locks and other shared data needed by lower objects such as table, matcher, rule, action. Each domain can contain multiple tables. Domain is equivalent to namespaces e.g (NIC/RDMA/Vport/ESwitch, etc ..) as implemented currently in mlx5_core fs_core (flow steering core). Table - Table objects are used for holding multiple matchers, each table has a level used to prevent processing loops. Packets are being directed to this table once it is set as the root table, this is done by fs_core using a FW command. A packet is being processed inside the table matcher by matcher until a successful hit, otherwise the packet will perform the default action. Matcher - Matchers objects are used to specify the fields mask for matching when processing a packet. A matcher belongs to a table, each matcher can hold multiple rules, each rule with different matching values corresponding to the matcher mask. Each matcher has a priority used for rule processing order inside the table. Action - Action objects are created to specify different steering actions such as count, reformat (encapsulate, decapsulate, ...), modify header, forward to table and many other actions. When creating a rule a sequence of actions can be provided to be executed on a successful match. Rule - Rule objects are used to specify a specific match on packets as well as the actions that should be executed. A rule belongs to a matcher. STE - This layer is used to hold the specific STE format for the device and to convert the requested rule to STEs. Each rule is constructed of an STE chain, Multiple rules construct a steering graph. Each node in the graph is a hash table containing multiple STEs. The index of each STE in the hash table is being calculated using a CRC32 hash function. Memory pool - Used for managing and caching device owned memory for rule insertion. The memory is being allocated using DM (device memory) API. Communication with device - layer for standard RDMA operation using RC QP to configure the device steering. Command utility - This module holds all of the FW commands that are required for SW steering to function. Patch planning and files: ------------------------- 1) First patch, adds the support to Add flow steering actions to fs_cmd shim layer. 2) Next 12 patch will add a file per each Software steering functionality/module as described above. (See patches with title: DR, *) 3) Add CONFIG_MLX5_SW_STEERING for software steering support and enable build with the new files 4) Next two patches will add the support for software steering in mlx5 steering shim layer net/mlx5: Add API to set the namespace steering mode net/mlx5: Add direct rule fs_cmd implementation 5) Last two patches will add the new devlink parameter to select mlx5 steering mode, will be valid only for switchdev mode for now. Two modes are supported: 1. DMFS - Device managed flow steering 2. SMFS - Software/Driver managed flow steering. In the DMFS mode, the HW steering entities are created through the FW. In the SMFS mode this entities are created though the driver directly. The driver will use the devlink steering mode only if the steering domain supports it, for now SMFS will manages only the switchdev eswitch steering domain. User command examples: - Set SMFS flow steering mode:: $ devlink dev param set pci/0000:06:00.0 name flow_steering_mode value "smfs" cmode runtime - Read device flow steering mode:: $ devlink dev param show pci/0000:06:00.0 name flow_steering_mode pci/0000:06:00.0: name flow_steering_mode type driver-specific values: cmode runtime value smfs ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net/mlx5: Add devlink flow_steering_mode parameter	Maor Gottlieb	2019-09-03	2	-1/+144
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add new parameter (flow_steering_mode) to control the flow steering mode of the driver. Two modes are supported: 1. DMFS - Device managed flow steering 2. SMFS - Software/Driver managed flow steering. In the DMFS mode, the HW steering entities are created through the FW. In the SMFS mode this entities are created though the driver directly. The driver will use the devlink steering mode only if the steering domain supports it, for now SMFS will manages only the switchdev eswitch steering domain. User command examples: - Set SMFS flow steering mode:: $ devlink dev param set pci/0000:06:00.0 name flow_steering_mode value "smfs" cmode runtime - Read device flow steering mode:: $ devlink dev param show pci/0000:06:00.0 name flow_steering_mode pci/0000:06:00.0: name flow_steering_mode type driver-specific values: cmode runtime value smfs Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
\| *	net/mlx5: Add support to use SMFS in switchdev mode	Maor Gottlieb	2019-09-03	2	-7/+55
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In case that flow steering mode of the driver is SMFS (Software Managed Flow Steering), then use the DR (SW steering) API to create the steering objects. In addition, add a call to the set peer namespace when switchdev gets devcom pair event. It is required to support VF LAG in SMFS. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
\| *	net/mlx5: Add API to set the namespace steering mode	Maor Gottlieb	2019-09-03	2	-1/+60
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add API to set the flow steering root namesapce mode. Setting new mode should be called before any steering operation is executed on the namespace. This API is going to be used by steering users such switchdev. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
\| *	net/mlx5: Add direct rule fs_cmd implementation	Maor Gottlieb	2019-09-03	7	-6/+717
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support to create flow steering objects via direct rule API (SW steering). New layer is added - fs_dr, this layer translates the command that fs_core sends to the FW into direct rule API. In case that direct rule is not supported in some feature then -EOPNOTSUPP is returned. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
\| *	net/mlx5: DR, Add CONFIG_MLX5_SW_STEERING for software steering support	Alex Vesker	2019-09-03	3	-0/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add new mlx5 Kconfig flag to allow selecting software steering support and compile all the steering files only if the flag is selected. Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@mellanox.com> Reviewed-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
\| *	net/mlx5: DR, Expose APIs for direct rule managing	Alex Vesker	2019-09-03	1	-0/+212
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Expose APIs for direct rule managing to increase insertion rate by bypassing the firmware. Signed-off-by: Alex Vesker <valex@mellanox.com> Reviewed-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
\| *	net/mlx5: DR, Add required FW steering functionality	Alex Vesker	2019-09-03	1	-0/+93
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	SW steering is capable of doing many steering functionalities but there are still some functionalities which are not exposed to upper layers and therefore performed by the FW. This is the support for recalculating checksum using a hairpin QP. The recalculation is required after a modify TTL action which skips the needed CS calculation in HW. Signed-off-by: Alex Vesker <valex@mellanox.com> Reviewed-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
\| *	net/mlx5: DR, Expose steering rule functionality	Alex Vesker	2019-09-03	1	-0/+1243
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rules are the actual objects that tie matchers, header values and actions. Each rule belongs to a matcher, which can hold multiple rules sharing the same mask. Each rule is a specific set of values and actions. When a packet reaches a matcher it is being matched against the matcher`s rules. In case of a match over a rule its actions will be executed. Each rule object contains a set of STEs, where each STE is a definition of match values and actions defined by the rule. This file handles the rule operations and processing. Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
\| *	net/mlx5: DR, Expose steering action functionality	Alex Vesker	2019-09-03	1	-0/+1588
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On rule creation a set of actions can be provided, the actions describe what to do with the packet in case of a match. It is possible to provide a set of actions which will be done by order. Signed-off-by: Alex Vesker <valex@mellanox.com> Reviewed-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
\| *	net/mlx5: DR, Expose steering matcher functionality	Alex Vesker	2019-09-03	1	-0/+770
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Matcher defines which packets fields are matched when a packet arrives. Matcher is a part of a table and can contain one or more rules. Where rule defines specific values of the matcher's mask definition. Signed-off-by: Alex Vesker <valex@mellanox.com> Reviewed-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
\| *	net/mlx5: DR, Expose steering table functionality	Alex Vesker	2019-09-03	1	-0/+294
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Tables are objects which are used for storing matchers, each table belongs to a domain and defined by the domain type. When a packet reaches the table it is being processed by each of its matchers until a successful match. Tables can hold multiple matchers ordered by matcher priority. Each table has a level. Signed-off-by: Alex Vesker <valex@mellanox.com> Reviewed-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
\| *	net/mlx5: DR, Expose steering domain functionality	Alex Vesker	2019-09-03	1	-0/+395
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Domain is the frame for all of the dr (direct rule) objects. There are different domain types which also affect the object under that domain. Each domain can hold multiple tables which can hold multiple matchers and so on, this means that all of the dr (direct rule) objects exist under a specific domain. The domain object also holds the resources needed for other objects such as memory management and communication with the device. Signed-off-by: Alex Vesker <valex@mellanox.com> Reviewed-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
\| *	net/mlx5: DR, Add Steering entry (STE) utilities	Alex Vesker	2019-09-03	2	-0/+2406
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Steering Entry (STE) object is the basic building block of the steering map. There are several types of STEs. Each rule can be constructed of multiple STEs. Each STE dictates which fields of the packet's header are being matched as well as the information about the next step in map (hit and miss pointers). The hardware gets a packet and tries to match it against the STEs, going to either the hit pointer or the miss pointer. This file handles the STE operations. Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
\| *	net/mlx5: DR, Expose an internal API to issue RDMA operations	Alex Vesker	2019-09-03	1	-0/+976
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Inserting or deleting a rule is done by RDMA read/write operation to SW ICM device memory. This file provides the support for executing these operations. It includes allocating the needed resources and providing an API for writing steering entries to the memory. Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
\| *	net/mlx5: DR, ICM pool memory allocator	Alex Vesker	2019-09-03	1	-0/+570
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ICM device memory is used for writing steering rules (STEs) to the NIC. An ICM memory pool allocator was implemented to manage the required memory. The pool consists of buckets, a bucket per chunk size. Once a bucket is empty we will cut a row of memory from the latest allocated MR, if the MR size is not sufficient we will allocate a new MR. HW design requires that chunks memory address should be aligned to the chunk size, this is the reason for managing the MR with row size that insures memory alignment. Current design is greedy in memory but provides quick allocation times in steady state. Signed-off-by: Alex Vesker <valex@mellanox.com> Reviewed-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
\| *	net/mlx5: DR, Add direct rule command utilities	Alex Vesker	2019-09-03	2	-0/+1084
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add direct rule command utilities which consists of all the FW commands that are executed to provide the SW steering functionality. Signed-off-by: Alex Vesker <valex@mellanox.com> Reviewed-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
\| *	net/mlx5: DR, Add the internal direct rule types definitions	Alex Vesker	2019-09-03	1	-0/+1060
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add the internal header file that contains various types definition that will be used in coming patches as well as the internal functions decelerations. Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@mellanox.com> Reviewed-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
\| *	net/mlx5: Add flow steering actions to fs_cmd shim layer	Maor Gottlieb	2019-09-03	13	-110/+289
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add flow steering actions: modify header and packet reformat to the fs_cmd shim layer. This allows each namespace to define possibly different functionality for alloc/dealloc action commands. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
\| *	Merge branch 'mlx5-next' of ↵	Saeed Mahameed	2019-09-02	14	-251/+497
\| \|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Merge mlx5-next patches needed for upcoming mlx5 software steering. 1) Alex adds HW bits and definitions required for SW steering 2) Ariel moves device memory management to mlx5_core (From mlx5_ib) 3) Maor, Cleanups and fixups for eswitch mode and RoCE 4) Mark, Set only stag for match untagged packets Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
\| \| *	net/mlx5: Set only stag for match untagged packets	Mark Bloch	2019-09-01	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	cvlan_tag enabled in match criteria and disabled in match value means both S & C tags don't exist (untagged of both). Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
\| \| *	net/mlx5: Add stub for mlx5_eswitch_mode	Maor Gottlieb	2019-09-01	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Return MLX5_ESWITCH_NONE when CONFIG_MLX5_ESWITCH is not selected. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
\| \| *	net/mlx5: Avoid disabling RoCE when uninitialized	Maor Gottlieb	2019-09-01	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move the check if RoCE steering is initialized to the disable RoCE function, it will ensure that we disable RoCE only if we succeeded in enabling it before. Fixes: 80f09dfc237f ("net/mlx5: Eswitch, enable RoCE loopback traffic") Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
\| \| *	net/mlx5: Add HW bits and definitions required for SW steering	Alex Vesker	2019-09-01	2	-37/+205
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add the required Software Steering hardware definitions and bits to mlx5_ifc. Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Yevgeny Klitenik <kliten@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
\| \| *	net/mlx5: Move device memory management to mlx5_core	Ariel Levkovich	2019-09-01	9	-209/+276
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move the device memory allocation and deallocation commands SW ICM memory to mlx5_core to expose this API for all mlx5_core users. This comes as preparation for supporting SW steering in kernel where it will be required to allocate and register device memory for direct rule insertion. In addition, an API to register this device memory for future remote access operations is introduced using the create_mkey commands. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* \| \|	Merge branch 'mvpp2-per-cpu-buffers'	David S. Miller	2019-09-02	2	-41/+237
\|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Matteo Croce says: ==================== mvpp2: per-cpu buffers This patchset workarounds an PP2 HW limitation which prevents to use per-cpu rx buffers. The first patch is just a refactor to prepare for the second one. The second one allocates percpu buffers if the following conditions are met: - CPU number is less or equal 4 - no port is using jumbo frames If the following conditions are not met at load time, of jumbo frame is enabled later on, the shared allocation is reverted. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \|	mvpp2: percpu buffers	Matteo Croce	2019-09-02	2	-23/+222
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Every mvpp2 unit can use up to 8 buffers mapped by the BM (the HW buffer manager). The HW will place the frames in the buffer pool depending on the frame size: short (< 128 bytes), long (< 1664) or jumbo (up to 9856). As any unit can have up to 4 ports, the driver allocates only 2 pools, one for small and one long frames, and share them between ports. When the first port MTU is set higher than 1664 bytes, a third pool is allocated for jumbo frames. This shared allocation makes impossible to use percpu allocators, and creates contention between HW queues. If possible, i.e. if the number of possible CPU are less than 8 and jumbo frames are not used, switch to a new scheme: allocate 8 per-cpu pools for short and long frames and bind every pool to an RXQ. When the first port MTU is set higher than 1664 bytes, the allocation scheme is reverted to the old behaviour (3 shared pools), and when all ports MTU are lowered, the per-cpu buffers are allocated again. Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \|	mvpp2: refactor BM pool functions	Matteo Croce	2019-09-02	1	-19/+16
\|/ / / \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Refactor mvpp2_bm_pool_create(), mvpp2_bm_pool_destroy() and mvpp2_bm_pools_init() so that they accept a struct device instead of a struct platform_device, as they just need platform_device->dev. Removing such dependency makes the BM code more reusable in context where we don't have a pointer to the platform_device. Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	net: dsa: Fix off-by-one number of calls to devlink_port_unregister	Vladimir Oltean	2019-09-02	1	-10/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a function such as dsa_slave_create fails, currently the following stack trace can be seen: [ 2.038342] sja1105 spi0.1: Probed switch chip: SJA1105T [ 2.054556] sja1105 spi0.1: Reset switch and programmed static config [ 2.063837] sja1105 spi0.1: Enabled switch tagging [ 2.068706] fsl-gianfar soc:ethernet@2d90000 eth2: error -19 setting up slave phy [ 2.076371] ------------[ cut here ]------------ [ 2.080973] WARNING: CPU: 1 PID: 21 at net/core/devlink.c:6184 devlink_free+0x1b4/0x1c0 [ 2.088954] Modules linked in: [ 2.092005] CPU: 1 PID: 21 Comm: kworker/1:1 Not tainted 5.3.0-rc6-01360-g41b52e38d2b6-dirty #1746 [ 2.100912] Hardware name: Freescale LS1021A [ 2.105162] Workqueue: events deferred_probe_work_func [ 2.110287] [<c03133a4>] (unwind_backtrace) from [<c030d8cc>] (show_stack+0x10/0x14) [ 2.117992] [<c030d8cc>] (show_stack) from [<c10b08d8>] (dump_stack+0xb4/0xc8) [ 2.125180] [<c10b08d8>] (dump_stack) from [<c0349d04>] (__warn+0xe0/0xf8) [ 2.132018] [<c0349d04>] (__warn) from [<c0349e34>] (warn_slowpath_null+0x40/0x48) [ 2.139549] [<c0349e34>] (warn_slowpath_null) from [<c0f19d74>] (devlink_free+0x1b4/0x1c0) [ 2.147772] [<c0f19d74>] (devlink_free) from [<c1064fc0>] (dsa_switch_teardown+0x60/0x6c) [ 2.155907] [<c1064fc0>] (dsa_switch_teardown) from [<c1065950>] (dsa_register_switch+0x8e4/0xaa8) [ 2.164821] [<c1065950>] (dsa_register_switch) from [<c0ba7fe4>] (sja1105_probe+0x21c/0x2ec) [ 2.173216] [<c0ba7fe4>] (sja1105_probe) from [<c0b35948>] (spi_drv_probe+0x80/0xa4) [ 2.180920] [<c0b35948>] (spi_drv_probe) from [<c0a4c1cc>] (really_probe+0x108/0x400) [ 2.188711] [<c0a4c1cc>] (really_probe) from [<c0a4c694>] (driver_probe_device+0x78/0x1bc) [ 2.196933] [<c0a4c694>] (driver_probe_device) from [<c0a4a3dc>] (bus_for_each_drv+0x58/0xb8) [ 2.205414] [<c0a4a3dc>] (bus_for_each_drv) from [<c0a4c024>] (__device_attach+0xd0/0x168) [ 2.213637] [<c0a4c024>] (__device_attach) from [<c0a4b1d0>] (bus_probe_device+0x84/0x8c) [ 2.221772] [<c0a4b1d0>] (bus_probe_device) from [<c0a4b72c>] (deferred_probe_work_func+0x84/0xc4) [ 2.230686] [<c0a4b72c>] (deferred_probe_work_func) from [<c03650a4>] (process_one_work+0x218/0x510) [ 2.239772] [<c03650a4>] (process_one_work) from [<c03660d8>] (worker_thread+0x2a8/0x5c0) [ 2.247908] [<c03660d8>] (worker_thread) from [<c036b348>] (kthread+0x148/0x150) [ 2.255265] [<c036b348>] (kthread) from [<c03010e8>] (ret_from_fork+0x14/0x2c) [ 2.262444] Exception stack(0xea965fb0 to 0xea965ff8) [ 2.267466] 5fa0: 00000000 00000000 00000000 00000000 [ 2.275598] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 2.283729] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000 [ 2.290333] ---[ end trace ca5d506728a0581a ]--- devlink_free is complaining right here: WARN_ON(!list_empty(&devlink->port_list)); This happens because devlink_port_unregister is no longer done right away in dsa_port_setup when a DSA_PORT_TYPE_USER has failed. Vivien said about this change that: Also no need to call devlink_port_unregister from within dsa_port_setup as this step is inconditionally handled by dsa_port_teardown on error. which is not really true. The devlink_port_unregister function _is_ being called unconditionally from within dsa_port_setup, but not for this port that just failed, just for the previous ones which were set up. ports_teardown: for (i = 0; i < port; i++) dsa_port_teardown(&ds->ports[i]); Initially I was tempted to fix this by extending the "for" loop to also cover the port that failed during setup. But this could have potentially unforeseen consequences unrelated to devlink_port or even other types of ports than user ports, which I can't really test for. For example, if for some reason devlink_port_register itself would fail, then unconditionally unregistering it in dsa_port_teardown would not be a smart idea. The list might go on. So just make dsa_port_setup undo the setup it had done upon failure, and let the for loop undo the work of setting up the previous ports, which are guaranteed to be brought up to a consistent state. Fixes: 955222ca5281 ("net: dsa: use a single switch statement for port setup") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>