summaryrefslogtreecommitdiffstats
path: root/drivers
Commit message (Collapse)AuthorAgeFilesLines
* net: dsa: felix: perform MDB migration based on ocelot->multicast listVladimir Oltean2022-05-062-46/+66
| | | | | | | | | | | | The felix driver is the only user of dsa_port_walk_mdbs(), and there isn't even a good reason for it, considering that the host MDB entries are already saved by the ocelot switch lib in the ocelot->multicast list. Rewrite the multicast entry migration procedure around the ocelot->multicast list so we can delete dsa_port_walk_mdbs(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* net: dsa: felix: stop migrating FDBs back and forth on tag proto changeVladimir Oltean2022-05-061-53/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | I just realized we don't need to migrate the host-filtered FDB entries when the tagging protocol changes from "ocelot" to "ocelot-8021q". Host-filtered addresses are learned towards the PGID_CPU "multicast" port group, reserved by software, which contains BIT(ocelot->num_phys_ports). That is the "special" port entry in the analyzer block for the CPU port module. In "ocelot" mode, the CPU port module's packets are redirected to the NPI port. In "ocelot-8021q" mode, felix_8021q_cpu_port_init() does something funny anyway, and changes PGID_CPU to stop pointing at the CPU port module and start pointing at the physical port where the DSA master is attached. The fact that we can alter the destination of packets learned towards PGID_CPU without altering the MAC table entries themselves means that it is pointless to walk through the FDB entries, forget that they were learned towards PGID_CPU, and re-learn them towards the "unicast" PGID associated with the physical port connected to the DSA master. We can let the PGID_CPU value change simply alter the destination of the host-filtered unicast packets in one fell swoop. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* net: dsa: felix: use PGID_CPU for FDB entry migration on NPI portVladimir Oltean2022-05-061-4/+2
| | | | | | | | | | | | | | | | | | | ocelot_fdb_add() redirects FDB entries installed on the NPI port towards the special reserved PGID_CPU used for host-filtered addresses. PGID_CPU contains BIT(ocelot->num_phys_ports) in the destination port mask, which is code name for the CPU port module. Whereas felix_migrate_fdbs_to_*_port() uses the ocelot->num_phys_ports PGID directly, and it appears that this works too. Even if this PGID is set to zero, apparently its number is special and packets still reach the CPU port module. Nonetheless, in the end, these addresses end up in the same place regardless of whether they go through an extra indirection layer or not. Use PGID_CPU across to have more uniformity. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* mlxbf_gige: increase MDIO polling rate to 5usDavid Thompson2022-05-061-2/+4
| | | | | | | | | | | | | | This patch increases the polling rate used by the mlxbf_gige driver on the MDIO bus. The previous polling rate was every 100us, and the new rate is every 5us. With this change the amount of time spent waiting for the MDIO BUSY signal to de-assert drops from ~100us to ~27us for each operation. Signed-off-by: David Thompson <davthompson@nvidia.com> Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com> Link: https://lore.kernel.org/r/20220505162309.20050-1-davthompson@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* Merge branch '10GbE' of ↵Jakub Kicinski2022-05-062-4/+4
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== 10GbE Intel Wired LAN Driver Updates 2022-05-05 This series contains updates to ixgbe and igb drivers. Jeff Daly adjusts type for 'allow_unsupported_sfp' to match the associated struct value for ixgbe. Alaa Mohamed converts, deprecated, kmap() call to kmap_local_page() for igb. * '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: igb: Convert kmap() to kmap_local_page() ixgbe: Fix module_param allow_unsupported_sfp type ==================== Link: https://lore.kernel.org/r/20220505155651.2606195-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * igb: Convert kmap() to kmap_local_page()Alaa Mohamed2022-05-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | kmap() is being deprecated and these usages are all local to the thread so there is no reason kmap_local_page() can't be used. Replace kmap() calls with kmap_local_page(). Signed-off-by: Alaa Mohamed <eng.alaamohamedsoliman.am@gmail.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * ixgbe: Fix module_param allow_unsupported_sfp typeJeff Daly2022-05-051-2/+2
| | | | | | | | | | | | | | | | | | The module_param allow_unsupported_sfp should be a boolean to match the type in the ixgbe_hw struct. Signed-off-by: Jeff Daly <jeffd@silicom-usa.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
* | net: make drivers set the TSO limit not the GSO limitJakub Kicinski2022-05-0626-38/+41
| | | | | | | | | | | | | | | | Drivers should call the TSO setting helper, GSO is controllable by user space. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: add netif_inherit_tso_max()Jakub Kicinski2022-05-065-14/+7
| | | | | | | | | | | | | | | | | | | | To make later patches smaller create a helper for inheriting the TSO limitations of a lower device. The TSO in the name is not an accident, subsequent patches will replace GSO with TSO in more names. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | nfp: flower: enable decap_v2 bitLouis Peens2022-05-061-1/+2
| | | | | | | | | | | | | | | | | | | | Finally enable the decap_v2 feature bit now that all the other bits are in place to configure it correctly. Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | nfp: flower: remove unused neighbour cacheLouis Peens2022-05-062-183/+0
| | | | | | | | | | | | | | | | | | | | | | With the neighbour entries now stored in a dedicated table there is no use to make use of the tunnel route cache anymore, so remove this. Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | nfp: flower: link pre_tun flow rules with neigh entriesLouis Peens2022-05-063-1/+154
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add helper functions that can create links between flow rules and cached neighbour entries. Also add the relevant calls to these functions. * When a new neighbour entry gets added cycle through the saved pre_tun flow list and link any relevant matches. Update the neighbour table on the nfp with this new information. * When a new pre_tun flow rule gets added iterate through the save neighbour entries and link any relevant matches. Once again update the nfp neighbour table with any new links. * Do the inverse when deleting - remove any created links and also inform the nfp of this. Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | nfp: flower: rework tunnel neighbour configurationLouis Peens2022-05-061-59/+115
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch updates the way in which the tunnel neighbour entries are handled. Previously they were mostly send-and-forget, with just the destination IP's cached in a list. This update changes to a scheme where the neighbour entry information is stored in a hash table. The reason for this is that the neighbour table will now also be used on the decapsulation path, whereas previously it was only used for encapsulation. We need to save more of the neighbour information in order to link them with flower flows in follow up patches. Updating of the neighbour table is now also handled by the same function, instead of separate *_write_neigh_vX functions. Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | nfp: flower: update nfp_tun_neigh structsLouis Peens2022-05-062-33/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | Prepare for more rework in following patches by updating the existing nfp_neigh_structs. The update allows for the same headers to be used for both old and new firmware, with a slight length adjustment when sending the control message to the firmware. Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | nfp: flower: fixup ipv6/ipv4 route lookup for neigh eventsLouis Peens2022-05-061-19/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a callback is received to invalidate a neighbour entry there is no need to try and populate any other flow information. Only the flowX->daddr information is needed as lookup key to delete an entry from the NFP neighbour table. Fix this by only doing the lookup if the callback is for a new entry. As part of this cleanup remove the setting of flow6.flowi6_proto, as this is not needed either, it looks to be a possible leftover from a previous implementation. Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | nfp: flower: enforce more strict pre_tun checksLouis Peens2022-05-061-7/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Make sure that the rule also matches on source MAC address. On top of that also now save the src and dst MAC addresses similar to how vlan_tci is saved - this will be used in later comparisons with neighbour entries. Indicate if the flow matched on ipv4 or ipv6. Populate the vlan_tpid field that got added to the pre_run_rule struct as well. Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | nfp: flower: add/remove predt_list entriesLouis Peens2022-05-062-7/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Add calls to add and remove flows to the predt_table. This very simply just allocates and add a new pretun entry if detected as such, and removes it when encountered on a delete flow. Compatibility for older firmware is kept in place through the DECAP_V2 feature bit. Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | nfp: flower: add infrastructure for pre_tun reworkLouis Peens2022-05-063-33/+105
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The previous implementation of using a pre_tun_table for decap has some limitations, causing flows to end up unoffloaded when in fact we are able to offload them. This is because the pre_tun_table does not have enough matching resolution. The next step is to instead make use of the neighbour table which already exists for the encap direction. This patch prepares for this by: - Moving nfp_tun_neigh/_v6 to main.h. - Creating two new "wrapping" structures, one to keep track of neighbour entries (previously they were send-and-forget), and another to keep track of pre_tun flows. - Create a new list in nfp_flower_priv to keep track of pre_tunnel flows - Create a new table in nfp_flower_priv to keep track of next neighbour entries - Initialising and destroying these new list/tables - Extending nfp_fl_payload->pre_tun_rule to save more information for future use. Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch '100GbE' of ↵David S. Miller2022-05-0613-53/+150
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== 100GbE Intel Wired LAN Driver Updates 2022-05-05 This series contains updates to ice driver only. Wan Jiabing converts an open coded min selection to min_t(). Maciej commonizes on a single find VSI function and removes the duplicated implementation. Wojciech adjusts the return value when exceeding ICE_MAX_CHAIN_WORDS to, a more appropriate, -ENOSPC and allows for the error to be propagated. Michal adds support for ndo_get_devlink_port(). Jake does some cleanup related to virtualization code. Mainly involving function header comments and wording changes. NULL checks are added to ice_get_vf_vsi() calls in order to prevent static analysis tools from complaining that a NULL value could be dereferenced. --- v2: Dropped patch 1: "ice: Add support for classid based queue selection" ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | ice: remove period on argument description in ice_for_each_vfJacob Keller2022-05-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | The ice_for_each_vf macros have comments describing the implementation. One of the arguments has a period on the end, which is not our typical style. Remove the unnecessary period. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * | ice: add a function comment for ice_cfg_mac_antispoofJacob Keller2022-05-051-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | This function definition was missing a comment describing its implementation. Add one. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * | ice: fix wording in comment for ice_reset_vfJacob Keller2022-05-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | The comment explaining ice_reset_vf has an extraneous "the" with the "if the resets are disabled". Remove it. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * | ice: remove return value comment for ice_reset_all_vfsJacob Keller2022-05-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit fe99d1c06c16 ("ice: make ice_reset_all_vfs void"), the ice_reset_all_vfs function has not returned anything. The function comment still indicated it did. Fix this. While here, also add a line to clarify the function resets all VFs at once in response to hardware resets such as a PF reset. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * | ice: always check VF VSI pointer valuesJacob Keller2022-05-056-7/+77
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ice_get_vf_vsi function can return NULL in some cases, such as if handling messages during a reset where the VSI is being removed and recreated. Several places throughout the driver do not bother to check whether this VSI pointer is valid. Static analysis tools maybe report issues because they detect paths where a potentially NULL pointer could be dereferenced. Fix this by checking the return value of ice_get_vf_vsi everywhere. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * | ice: add newline to dev_dbg in ice_vf_fdir_dump_infoJacob Keller2022-05-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The debug print in ice_vf_fdir_dump_info does not end in newlines. This can look confusing when reading the kernel log, as the next print will immediately continue on the same line. Fix this by adding the forgotten newline. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * | ice: get switch id on switchdev devicesMichal Swiatkowski2022-05-052-0/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Switch id should be the same for each netdevice on a driver. The id must be unique between devices on the same system, but does not need to be unique between devices on different systems. The switch id is used to locate ports on a switch and to know if aggregated ports belong to the same switch. To meet this requirements, use pci_get_dsn as switch id value, as this is unique value for each devices on the same system. Implementing switch id is needed by automatic tools for kubernetes. Set switch id by setting devlink port attribiutes and calling devlink_port_attrs_set while creating pf (for uplink) and vf (for representator) devlink port. To get switch id (in switchdev mode): cat /sys/class/net/$PF0/phys_switch_id Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * | ice: return ENOSPC when exceeding ICE_MAX_CHAIN_WORDSWojciech Drewek2022-05-052-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When number of words exceeds ICE_MAX_CHAIN_WORDS, -ENOSPC should be returned not -EINVAL. Do not overwrite this error code in ice_add_tc_flower_adv_fltr. Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Suggested-by: Marcin Szycik <marcin.szycik@linux.intel.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * | ice: introduce common helper for retrieving VSI by vsi_numMaciej Fijalkowski2022-05-053-35/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Both ice_idc.c and ice_virtchnl.c carry their own implementation of a helper function that is looking for a given VSI based on provided vsi_num. Their functionality is the same, so let's introduce the common function in ice.h that both of the mentioned sites will use. This is a strictly cleanup thing, no functionality is changed. Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * | ice: use min_t() to make code cleaner in ice_gnssWan Jiabing2022-05-051-2/+1
| |/ | | | | | | | | | | | | | | | | Fix the following coccicheck warning: ./drivers/net/ethernet/intel/ice/ice_gnss.c:79:26-27: WARNING opportunity for min() Signed-off-by: Wan Jiabing <wanjiabing@vivo.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
* | firmware: tee_bnxt: Use UUID API for exporting the UUIDAndy Shevchenko2022-05-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | There is export_uuid() function which exports uuid_t to the u8 array. Use it instead of open coding variant. This allows to hide the uuid_t internals. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220504091407.70661-1-andriy.shevchenko@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | net: move snowflake callers to netif_napi_add_tx_weight()Jakub Kicinski2022-05-053-7/+8
| | | | | | | | | | | | | | | | Make the drivers with custom tx napi weight call netif_napi_add_tx_weight(). Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Link: https://lore.kernel.org/r/20220504163725.550782-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | net: switch to netif_napi_add_tx()Jakub Kicinski2022-05-0518-38/+27
| | | | | | | | | | | | | | | | | | | | | | | | Switch net callers to the new API not requiring the NAPI_POLL_WEIGHT argument. Acked-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org> Acked-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Acked-by: Alexandra Winter <wintera@linux.ibm.com> Link: https://lore.kernel.org/r/20220504163725.550782-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | jme: remove an unnecessary indirectionJakub Kicinski2022-05-052-3/+1
| | | | | | | | | | | | | | | | Remove a define which looks like a OS abstraction layer and makes spatch conversions on this driver problematic. Link: https://lore.kernel.org/r/20220504163939.551231-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | net: ethernet: Prepare cleanup of powerpc's asm/prom.hChristophe Leroy2022-05-059-3/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | powerpc's asm/prom.h includes some headers that it doesn't need itself. In order to clean powerpc's asm/prom.h up in a further step, first clean all files that include asm/prom.h Some files don't need asm/prom.h at all. For those ones, just remove inclusion of asm/prom.h Some files don't need any of the items provided by asm/prom.h, but need some of the headers included by asm/prom.h. For those ones, add the needed headers that are brought by asm/prom.h at the moment and remove asm/prom.h Some files really need asm/prom.h but also need some of the headers included by asm/prom.h. For those one, leave asm/prom.h but also add the needed headers so that they can be removed from asm/prom.h in a later step. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Link: https://lore.kernel.org/r/09a13d592d628de95d30943e59b2170af5b48110.1651663857.git.christophe.leroy@csgroup.eu Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | sungem: Prepare cleanup of powerpc's asm/prom.hChristophe Leroy2022-05-051-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | powerpc's <asm/prom.h> includes some headers that it doesn't need itself. In order to clean powerpc's <asm/prom.h> up in a further step, first clean all files that include <asm/prom.h> sungem_phy.c doesn't use any object provided by <asm/prom.h>. But removing inclusion of <asm/prom.h> leads to the following errors: CC drivers/net/sungem_phy.o drivers/net/sungem_phy.c: In function 'bcm5421_init': drivers/net/sungem_phy.c:448:42: error: implicit declaration of function 'of_get_parent'; did you mean 'dget_parent'? [-Werror=implicit-function-declaration] 448 | struct device_node *np = of_get_parent(phy->platform_data); | ^~~~~~~~~~~~~ | dget_parent drivers/net/sungem_phy.c:448:42: warning: initialization of 'struct device_node *' from 'int' makes pointer from integer without a cast [-Wint-conversion] drivers/net/sungem_phy.c:450:35: error: implicit declaration of function 'of_get_property' [-Werror=implicit-function-declaration] 450 | if (np == NULL || of_get_property(np, "no-autolowpower", NULL)) | ^~~~~~~~~~~~~~~ Remove <asm/prom.h> from included headers but add <linux/of.h> to handle the above. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Link: https://lore.kernel.org/r/f7a7fab3ec5edf803d934fca04df22631c2b449d.1651662885.git.christophe.leroy@csgroup.eu Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | Revert "Merge branch 'mlxsw-line-card-model'"Jakub Kicinski2022-05-053-321/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 5e927a9f4b9f29d78a7c7d66ea717bb5c8bbad8e, reversing changes made to cfc1d91a7d78cf9de25b043d81efcc16966d55b3. The discussion is still ongoing so let's remove the uAPI until the discussion settles. Link: https://lore.kernel.org/all/20220425090021.32e9a98f@kernel.org/ Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/20220504154037.539442-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2022-05-05137-821/+1582
|\ \ | |/ |/| | | | | | | | | | | | | tools/testing/selftests/net/forwarding/Makefile f62c5acc800e ("selftests/net/forwarding: add missing tests to Makefile") 50fe062c806e ("selftests: forwarding: new test, verify host mdb entries") https://lore.kernel.org/all/20220502111539.0b7e4621@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * Merge tag 'net-5.18-rc6' of ↵Linus Torvalds2022-05-0537-214/+369
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from can, rxrpc and wireguard. Previous releases - regressions: - igmp: respect RCU rules in ip_mc_source() and ip_mc_msfilter() - mld: respect RCU rules in ip6_mc_source() and ip6_mc_msfilter() - rds: acquire netns refcount on TCP sockets - rxrpc: enable IPv6 checksums on transport socket - nic: hinic: fix bug of wq out of bound access - nic: thunder: don't use pci_irq_vector() in atomic context - nic: bnxt_en: fix possible bnxt_open() failure caused by wrong RFS flag - nic: mlx5e: - lag, fix use-after-free in fib event handler - fix deadlock in sync reset flow Previous releases - always broken: - tcp: fix insufficient TCP source port randomness - can: grcan: grcan_close(): fix deadlock - nfc: reorder destructive operations in to avoid bugs Misc: - wireguard: improve selftests reliability" * tag 'net-5.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (63 commits) NFC: netlink: fix sleep in atomic bug when firmware download timeout selftests: ocelot: tc_flower_chains: specify conform-exceed action for policer tcp: drop the hash_32() part from the index calculation tcp: increase source port perturb table to 2^16 tcp: dynamically allocate the perturb table used by source ports tcp: add small random increments to the source port tcp: resalt the secret every 10 seconds tcp: use different parts of the port_offset for index and offset secure_seq: use the 64 bits of the siphash for port offset calculation wireguard: selftests: set panic_on_warn=1 from cmdline wireguard: selftests: bump package deps wireguard: selftests: restore support for ccache wireguard: selftests: use newer toolchains to fill out architectures wireguard: selftests: limit parallelism to $(nproc) tests at once wireguard: selftests: make routing loop test non-fatal net/mlx5: Fix matching on inner TTC net/mlx5: Avoid double clear or set of sync reset requested net/mlx5: Fix deadlock in sync reset flow net/mlx5e: Fix trust state reset in reload net/mlx5e: Avoid checking offload capability in post_parse action ...
| | * net/mlx5: Fix matching on inner TTCMark Bloch2022-05-042-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The cited commits didn't use proper matching on inner TTC as a result distribution of encapsulated packets wasn't symmetric between the physical ports. Fixes: 4c71ce50d2fe ("net/mlx5: Support partial TTC rules") Fixes: 8e25a2bc6687 ("net/mlx5: Lag, add support to create TTC tables for LAG port selection") Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
| | * net/mlx5: Avoid double clear or set of sync reset requestedMoshe Shemesh2022-05-041-9/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Double clear of reset requested state can lead to NULL pointer as it will try to delete the timer twice. This can happen for example on a race between abort from FW and pci error or reset. Avoid such case using test_and_clear_bit() to verify only one time reset requested state clear flow. Similarly use test_and_set_bit() to verify only one time reset requested state set flow. Fixes: 7dd6df329d4c ("net/mlx5: Handle sync reset abort event") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
| | * net/mlx5: Fix deadlock in sync reset flowMoshe Shemesh2022-05-041-17/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The sync reset flow can lead to the following deadlock when poll_sync_reset() is called by timer softirq and waiting on del_timer_sync() for the same timer. Fix that by moving the part of the flow that waits for the timer to reset_reload_work. It fixes the following kernel Trace: RIP: 0010:del_timer_sync+0x32/0x40 ... Call Trace: <IRQ> mlx5_sync_reset_clear_reset_requested+0x26/0x50 [mlx5_core] poll_sync_reset.cold+0x36/0x52 [mlx5_core] call_timer_fn+0x32/0x130 __run_timers.part.0+0x180/0x280 ? tick_sched_handle+0x33/0x60 ? tick_sched_timer+0x3d/0x80 ? ktime_get+0x3e/0xa0 run_timer_softirq+0x2a/0x50 __do_softirq+0xe1/0x2d6 ? hrtimer_interrupt+0x136/0x220 irq_exit+0xae/0xb0 smp_apic_timer_interrupt+0x7b/0x140 apic_timer_interrupt+0xf/0x20 </IRQ> Fixes: 3c5193a87b0f ("net/mlx5: Use del_timer_sync in fw reset flow of halting poll") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Maher Sanalla <msanalla@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
| | * net/mlx5e: Fix trust state reset in reloadMoshe Tal2022-05-041-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Setting dscp2prio during the driver reload can cause dcb ieee app list to be not empty after the reload finish and as a result to a conflict between the priority trust state reported by the app and the state in the device register. Reset the dcb ieee app list on initialization in case this is conflicting with the register status. Fixes: 2a5e7a1344f4 ("net/mlx5e: Add dcbnl dscp to priority support") Signed-off-by: Moshe Tal <moshet@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
| | * net/mlx5e: Avoid checking offload capability in post_parse actionAriel Levkovich2022-05-041-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During TC action parsing, the can_offload callback is called before calling the action's main parsing callback. Later on, the can_offload callback is called again before handling the action's post_parse callback if exists. Since the main parsing callback might have changed and set parsing params for the rule, following can_offload checks might fail because some parsing params were already set. Specifically, the ct action main parsing sets the ct param in the parsing status structure and when the second can_offload for ct action is called, before handling the ct post parsing, it will return an error since it checks this ct param to indicate multiple ct actions which are not supported. Therefore, the can_offload call is removed from the post parsing handling to prevent such cases. This is allowed since the first can_offload call will ensure that the action can be offloaded and the fact the code reached the post parsing handling already means that the action can be offloaded. Fixes: 8300f225268b ("net/mlx5e: Create new flow attr for multi table actions") Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Reviewed-by: Paul Blakey <paulb@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
| | * net/mlx5e: CT: Fix queued up restore put() executing after relevant ft releasePaul Blakey2022-05-041-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __mlx5_tc_ct_entry_put() queues release of tuple related to some ct FT, if that is the last reference to that tuple, the actual deletion of the tuple can happen after the FT is already destroyed and freed. Flush the used workqueue before destroying the ct FT. Fixes: a2173131526d ("net/mlx5e: CT: manage the lifetime of the ct entry object") Reviewed-by: Oz Shlomo <ozsh@nvidia.com> Signed-off-by: Paul Blakey <paulb@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
| | * net/mlx5e: TC, fix decap fallback to uplink when int port not supportedAriel Levkovich2022-05-041-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When resolving the decap route device for a tunnel decap rule, the result may be an OVS internal port device. Prior to adding the support for internal port offload, such case would result in using the uplink as the default decap route device which allowed devices that can't support internal port offload to offload this decap rule. This behavior got broken by adding the internal port offload which will fail in case the device can't support internal port offload. To restore the old behavior, use the uplink device as the decap route as before when internal port offload is not supported. Fixes: b16eb3c81fe2 ("net/mlx5: Support internal port as decap route device") Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
| | * net/mlx5e: TC, Fix ct_clear overwriting ct action metadataAriel Levkovich2022-05-043-16/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ct_clear action is translated to clearing reg_c metadata which holds ct state and zone information using mod header actions. These actions are allocated during the actions parsing, as part of the flow attributes main mod header action list. If ct action exists in the rule, the flow's main mod header is used only in the post action table rule, after the ct tables which set the ct info in the reg_c as part of the ct actions. Therefore, if the original rule has a ct_clear action followed by a ct action, the ct action reg_c setting will be done first and will be followed by the ct_clear resetting reg_c and overwriting the ct info. Fix this by moving the ct_clear mod header actions allocation from the ct action parsing stage to the ct action post parsing stage where it is already known if ct_clear is followed by a ct action. In such case, we skip the mod header actions allocation for the ct clear since the ct action will write to reg_c anyway after clearing it. Fixes: 806401c20a0f ("net/mlx5e: CT, Fix multiple allocations and memleak of mod acts") Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Reviewed-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
| | * net/mlx5e: Lag, Don't skip fib events on current dstVlad Buslov2022-05-042-8/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Referenced change added check to skip updating fib when new fib instance has same or lower priority. However, new fib instance can be an update on same dst address as existing one even though the structure is another instance that has different address. Ignoring events on such instances causes multipath LAG state to not be correctly updated. Track 'dst' and 'dst_len' fields of fib event fib_entry_notifier_info structure and don't skip events that have the same value of that fields. Fixes: ad11c4f1d8fd ("net/mlx5e: Lag, Only handle events from highest priority multipath entry") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
| | * net/mlx5e: Lag, Fix fib_info pointer assignmentVlad Buslov2022-05-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Referenced change incorrectly sets single path fib_info even when LAG is not active. Fix it by moving call to mlx5_lag_fib_set() into conditional that verifies LAG state. Fixes: ad11c4f1d8fd ("net/mlx5e: Lag, Only handle events from highest priority multipath entry") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
| | * net/mlx5e: Lag, Fix use-after-free in fib event handlerVlad Buslov2022-05-042-11/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recent commit that modified fib route event handler to handle events according to their priority introduced use-after-free[0] in mp->mfi pointer usage. The pointer now is not just cached in order to be compared to following fib_info instances, but is also dereferenced to obtain fib_priority. However, since mlx5 lag code doesn't hold the reference to fin_info during whole mp->mfi lifetime, it could be used after fib_info instance has already been freed be kernel infrastructure code. Don't ever dereference mp->mfi pointer. Refactor it to be 'const void*' type and cache fib_info priority in dedicated integer. Group fib_info-related data into dedicated 'fib' structure that will be further extended by following patches in the series. [0]: [ 203.588029] ================================================================== [ 203.590161] BUG: KASAN: use-after-free in mlx5_lag_fib_update+0xabd/0xd60 [mlx5_core] [ 203.592386] Read of size 4 at addr ffff888144df2050 by task kworker/u20:4/138 [ 203.594766] CPU: 3 PID: 138 Comm: kworker/u20:4 Tainted: G B 5.17.0-rc7+ #6 [ 203.596751] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [ 203.598813] Workqueue: mlx5_lag_mp mlx5_lag_fib_update [mlx5_core] [ 203.600053] Call Trace: [ 203.600608] <TASK> [ 203.601110] dump_stack_lvl+0x48/0x5e [ 203.601860] print_address_description.constprop.0+0x1f/0x160 [ 203.602950] ? mlx5_lag_fib_update+0xabd/0xd60 [mlx5_core] [ 203.604073] ? mlx5_lag_fib_update+0xabd/0xd60 [mlx5_core] [ 203.605177] kasan_report.cold+0x83/0xdf [ 203.605969] ? mlx5_lag_fib_update+0xabd/0xd60 [mlx5_core] [ 203.607102] mlx5_lag_fib_update+0xabd/0xd60 [mlx5_core] [ 203.608199] ? mlx5_lag_init_fib_work+0x1c0/0x1c0 [mlx5_core] [ 203.609382] ? read_word_at_a_time+0xe/0x20 [ 203.610463] ? strscpy+0xa0/0x2a0 [ 203.611463] process_one_work+0x722/0x1270 [ 203.612344] worker_thread+0x540/0x11e0 [ 203.613136] ? rescuer_thread+0xd50/0xd50 [ 203.613949] kthread+0x26e/0x300 [ 203.614627] ? kthread_complete_and_exit+0x20/0x20 [ 203.615542] ret_from_fork+0x1f/0x30 [ 203.616273] </TASK> [ 203.617174] Allocated by task 3746: [ 203.617874] kasan_save_stack+0x1e/0x40 [ 203.618644] __kasan_kmalloc+0x81/0xa0 [ 203.619394] fib_create_info+0xb41/0x3c50 [ 203.620213] fib_table_insert+0x190/0x1ff0 [ 203.621020] fib_magic.isra.0+0x246/0x2e0 [ 203.621803] fib_add_ifaddr+0x19f/0x670 [ 203.622563] fib_inetaddr_event+0x13f/0x270 [ 203.623377] blocking_notifier_call_chain+0xd4/0x130 [ 203.624355] __inet_insert_ifa+0x641/0xb20 [ 203.625185] inet_rtm_newaddr+0xc3d/0x16a0 [ 203.626009] rtnetlink_rcv_msg+0x309/0x880 [ 203.626826] netlink_rcv_skb+0x11d/0x340 [ 203.627626] netlink_unicast+0x4cc/0x790 [ 203.628430] netlink_sendmsg+0x762/0xc00 [ 203.629230] sock_sendmsg+0xb2/0xe0 [ 203.629955] ____sys_sendmsg+0x58a/0x770 [ 203.630756] ___sys_sendmsg+0xd8/0x160 [ 203.631523] __sys_sendmsg+0xb7/0x140 [ 203.632294] do_syscall_64+0x35/0x80 [ 203.633045] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 203.634427] Freed by task 0: [ 203.635063] kasan_save_stack+0x1e/0x40 [ 203.635844] kasan_set_track+0x21/0x30 [ 203.636618] kasan_set_free_info+0x20/0x30 [ 203.637450] __kasan_slab_free+0xfc/0x140 [ 203.638271] kfree+0x94/0x3b0 [ 203.638903] rcu_core+0x5e4/0x1990 [ 203.639640] __do_softirq+0x1ba/0x5d3 [ 203.640828] Last potentially related work creation: [ 203.641785] kasan_save_stack+0x1e/0x40 [ 203.642571] __kasan_record_aux_stack+0x9f/0xb0 [ 203.643478] call_rcu+0x88/0x9c0 [ 203.644178] fib_release_info+0x539/0x750 [ 203.644997] fib_table_delete+0x659/0xb80 [ 203.645809] fib_magic.isra.0+0x1a3/0x2e0 [ 203.646617] fib_del_ifaddr+0x93f/0x1300 [ 203.647415] fib_inetaddr_event+0x9f/0x270 [ 203.648251] blocking_notifier_call_chain+0xd4/0x130 [ 203.649225] __inet_del_ifa+0x474/0xc10 [ 203.650016] devinet_ioctl+0x781/0x17f0 [ 203.650788] inet_ioctl+0x1ad/0x290 [ 203.651533] sock_do_ioctl+0xce/0x1c0 [ 203.652315] sock_ioctl+0x27b/0x4f0 [ 203.653058] __x64_sys_ioctl+0x124/0x190 [ 203.653850] do_syscall_64+0x35/0x80 [ 203.654608] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 203.666952] The buggy address belongs to the object at ffff888144df2000 which belongs to the cache kmalloc-256 of size 256 [ 203.669250] The buggy address is located 80 bytes inside of 256-byte region [ffff888144df2000, ffff888144df2100) [ 203.671332] The buggy address belongs to the page: [ 203.672273] page:00000000bf6c9314 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x144df0 [ 203.674009] head:00000000bf6c9314 order:2 compound_mapcount:0 compound_pincount:0 [ 203.675422] flags: 0x2ffff800010200(slab|head|node=0|zone=2|lastcpupid=0x1ffff) [ 203.676819] raw: 002ffff800010200 0000000000000000 dead000000000122 ffff888100042b40 [ 203.678384] raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000 [ 203.679928] page dumped because: kasan: bad access detected [ 203.681455] Memory state around the buggy address: [ 203.682421] ffff888144df1f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 203.683863] ffff888144df1f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 203.685310] >ffff888144df2000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 203.686701] ^ [ 203.687820] ffff888144df2080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 203.689226] ffff888144df2100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 203.690620] ================================================================== Fixes: ad11c4f1d8fd ("net/mlx5e: Lag, Only handle events from highest priority multipath entry") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
| | * net/mlx5e: Fix the calling of update_buffer_lossy() APIMark Zhang2022-05-041-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | The arguments of update_buffer_lossy() is in a wrong order. Fix it. Fixes: 88b3d5c90e96 ("net/mlx5e: Fix port buffers cell size value") Signed-off-by: Mark Zhang <markzhang@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>