summaryrefslogtreecommitdiffstats
path: root/include/linux/ethtool.h
Commit message (Collapse)AuthorAgeFilesLines
* ethtool: add interface to read Tx hardware timestamping statisticsRahul Rameshbabu2024-04-051-1/+26
| | | | | | | | | | | | | Multiple network devices that support hardware timestamping appear to have common behavior with regards to timestamp handling. Implement common Tx hardware timestamping statistics in a tx_stats struct_group. Common Rx hardware timestamping statistics can subsequently be implemented in a rx_stats struct_group for ethtool_ts_stats. Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Link: https://lore.kernel.org/r/20240403212931.128541-2-rrameshbabu@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* net: ethtool: eee: Remove legacy _u32 from keeeAndrew Lunn2024-02-281-3/+0
| | | | | | | | | | All MAC drivers have been converted to use the link mode members of keee. So remove the _u32 values, and the code in the ethtool core to convert the legacy _u32 values to link modes. Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethtool: add linkmode bitmap support to struct ethtool_keeeHeiner Kallweit2024-01-311-0/+3
| | | | | | | | | | | Add linkmode bitmap members to struct ethtool_keee, but keep the legacy u32 bitmaps for compatibility with existing drivers. Use linkmode "supported" not being empty as indicator that a user wants to use the linkmode bitmap members instead of the legacy bitmaps. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethtool: add suffix _u32 to legacy bitmap members of struct ethtool_keeeHeiner Kallweit2024-01-311-3/+3
| | | | | | | | | | This is in preparation of using the existing names for linkmode bitmaps. Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethtool: adjust struct ethtool_keee to kernel needsHeiner Kallweit2024-01-311-5/+3
| | | | | | | | | | | | | | | | This patch changes the following in struct ethtool_keee - remove member cmd, it's not needed on kernel side - remove reserved fields - switch the semantically boolean members to type bool We don't have to change any user of the boolean members due to the implicit casting from/to bool. A small change is needed where a pointer to bool members is used, in addition remove few now unneeded double negations. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethtool: replace struct ethtool_eee with a new struct ethtool_keee on kernel ↵Heiner Kallweit2024-01-311-2/+14
| | | | | | | | | | | | | | | | | side In order to pass EEE link modes beyond bit 32 to userspace we have to complement the 32 bit bitmaps in struct ethtool_eee with linkmode bitmaps. Therefore, similar to ethtool_link_settings and ethtool_link_ksettings, add a struct ethtool_keee. In a first step it's an identical copy of ethtool_eee. This patch simply does a s/ethtool_eee/ethtool_keee/g for all users. No functional change intended. Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethtool: reformat kerneldoc for struct ethtool_fec_statsJonathan Corbet2023-12-291-2/+4
| | | | | | | | | | | | | | | | The kerneldoc comment for struct ethtool_fec_stats attempts to describe the "total" and "lanes" fields of the ethtool_fec_stat substructure in a way leading to these warnings: ./include/linux/ethtool.h:424: warning: Excess struct member 'lane' description in 'ethtool_fec_stats' ./include/linux/ethtool.h:424: warning: Excess struct member 'total' description in 'ethtool_fec_stats' Reformat the comment to retain the information while eliminating the warnings. Signed-off-by: Jonathan Corbet <corbet@lwn.net> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: ethtool: add support for symmetric-xor RSS hashAhmed Zaki2023-12-131-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Symmetric RSS hash functions are beneficial in applications that monitor both Tx and Rx packets of the same flow (IDS, software firewalls, ..etc). Getting all traffic of the same flow on the same RX queue results in higher CPU cache efficiency. A NIC that supports "symmetric-xor" can achieve this RSS hash symmetry by XORing the source and destination fields and pass the values to the RSS hash algorithm. The user may request RSS hash symmetry for a specific algorithm, via: # ethtool -X eth0 hfunc <hash_alg> symmetric-xor or turn symmetry off (asymmetric) by: # ethtool -X eth0 hfunc <hash_alg> The specific fields for each flow type should then be specified as usual via: # ethtool -N|-U eth0 rx-flow-hash <flow_type> s|d|f|n Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com> Link: https://lore.kernel.org/r/20231213003321.605376-4-ahmed.zaki@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* net: ethtool: get rid of get/set_rxfh_context functionsAhmed Zaki2023-12-131-15/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the RSS context parameters to struct ethtool_rxfh_param and use the get/set_rxfh to handle the RSS contexts as well. This is part 2/2 of the fix suggested in [1]: - Add a rss_context member to the argument struct and a capability like cap_link_lanes_supported to indicate whether driver supports rss contexts, then you can remove *et_rxfh_context functions, and instead call *et_rxfh() with a non-zero rss_context. Link: https://lore.kernel.org/netdev/20231121152906.2dd5f487@kernel.org/ [1] CC: Jesse Brandeburg <jesse.brandeburg@intel.com> CC: Tony Nguyen <anthony.l.nguyen@intel.com> CC: Marcin Wojtas <mw@semihalf.com> CC: Russell King <linux@armlinux.org.uk> CC: Sunil Goutham <sgoutham@marvell.com> CC: Geetha sowjanya <gakula@marvell.com> CC: Subbaraya Sundeep <sbhatta@marvell.com> CC: hariprasad <hkelam@marvell.com> CC: Saeed Mahameed <saeedm@nvidia.com> CC: Leon Romanovsky <leon@kernel.org> CC: Edward Cree <ecree.xilinx@gmail.com> CC: Martin Habets <habetsm.xilinx@gmail.com> Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com> Link: https://lore.kernel.org/r/20231213003321.605376-3-ahmed.zaki@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* net: ethtool: pass a pointer to parameters to get/set_rxfh ethtool opsAhmed Zaki2023-12-131-8/+30
| | | | | | | | | | | | | | | | | | | | | | | | The get/set_rxfh ethtool ops currently takes the rxfh (RSS) parameters as direct function arguments. This will force us to change the API (and all drivers' functions) every time some new parameters are added. This is part 1/2 of the fix, as suggested in [1]: - First simplify the code by always providing a pointer to all params (indir, key and func); the fact that some of them may be NULL seems like a weird historic thing or a premature optimization. It will simplify the drivers if all pointers are always present. - Then make the functions take a dev pointer, and a pointer to a single struct wrapping all arguments. The set_* should also take an extack. Link: https://lore.kernel.org/netdev/20231121152906.2dd5f487@kernel.org/ [1] Suggested-by: Jakub Kicinski <kuba@kernel.org> Suggested-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com> Link: https://lore.kernel.org/r/20231213003321.605376-2-ahmed.zaki@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* ethtool: add SET for TCP_DATA_SPLIT ringparamAlexander Lobakin2023-12-131-0/+2
| | | | | | | | | | | | | | | | | Follow up commit 9690ae604290 ("ethtool: add header/data split indication") and add the set part of Ethtool's header split, i.e. ability to enable/disable header split via the Ethtool Netlink interface. This might be helpful to optimize the setup for particular workloads, for example, to avoid XDP frags, and so on. A driver should advertise ``ETHTOOL_RING_USE_TCP_DATA_SPLIT`` in its ops->supported_ring_params to allow doing that. "Unknown" passed from the userspace when the header split is supported means the driver is free to choose the preferred state. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://lore.kernel.org/r/20231212142752.935000-2-aleksander.lobakin@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* ethtool: Implement ethtool_puts()justinstitt@google.com2023-12-081-0/+13
| | | | | | | | | | | | | Use strscpy() to implement ethtool_puts(). Functionally the same as ethtool_sprintf() when it's used with two arguments or with just "%s" format specifier. Signed-off-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Madhuri Sripada <madhuri.sripada@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: ethtool: Refactor identical get_ts_info implementations.Richard Cochran2023-11-181-0/+8
| | | | | | | | | | | | The vlan, macvlan and the bonding drivers call their "real" device driver in order to report the time stamping capabilities. Provide a core ethtool helper function to avoid copy/paste in the stack. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: ethtool: Fix documentation of ethtool_sprintf()Andrew Lunn2023-11-021-2/+2
| | | | | | | | | | | | | | This function takes a pointer to a pointer, unlike sprintf() which is passed a plain pointer. Fix up the documentation to make this clear. Fixes: 7888fe53b706 ("ethtool: Add common function for filling out strings") Cc: Alexander Duyck <alexanderduyck@fb.com> Cc: Justin Stitt <justinstitt@google.com> Cc: stable@vger.kernel.org Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Justin Stitt <justinstitt@google.com> Link: https://lore.kernel.org/r/20231028192511.100001-1-andrew@lunn.ch Signed-off-by: Paolo Abeni <pabeni@redhat.com>
* ethtool: untangle the linkmode and ethtool headersJakub Kicinski2023-10-201-20/+2
| | | | | | | | | | | | | | | | | Commit 26c5334d344d ("ethtool: Add forced speed to supported link modes maps") added a dependency between ethtool.h and linkmode.h. The dependency in the opposite direction already exists so the new code was inserted in an awkward place. The reason for ethtool.h to include linkmode.h, is that ethtool_forced_speed_maps_init() is a static inline helper. That's not really necessary. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Paul Greenwalt <paul.greenwalt@intel.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethtool: Add forced speed to supported link modes mapsPaul Greenwalt2023-10-181-0/+37
| | | | | | | | | | | | | | | | | | | | | | The need to map Ethtool forced speeds to Ethtool supported link modes is common among drivers. To support this, add a common structure for forced speed maps and a function to init them. This is solution was originally introduced in commit 1d4e4ecccb11 ("qede: populate supported link modes maps on module init") for qede driver. ethtool_forced_speed_maps_init() should be called during driver init with an array of struct ethtool_forced_speed_map to populate the mapping. Definitions for maps themselves are left in the driver code, as the sets of supported link modes may vary between the devices. Suggested-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Pawel Chmielewski <pawel.chmielewski@intel.com> Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* bonding: add software tx timestamping supportHangbin Liu2023-04-181-0/+1
| | | | | | | | | | | | | | | | | | | | Currently, bonding only obtain the timestamp (ts) information of the active slave, which is available only for modes 1, 5, and 6. For other modes, bonding only has software rx timestamping support. However, some users who use modes such as LACP also want tx timestamp support. To address this issue, let's check the ts information of each slave. If all slaves support tx timestamping, we can enable tx timestamping support for the bond. Add a note that the get_ts_info may be called with RCU, or rtnl or reference on the device in ethtool.h> Suggested-by: Miroslav Lichvar <mlichvar@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Link: https://lore.kernel.org/r/20230418034841.2566262-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* ethtool: Add support for configuring tx_push_buf_lenShay Agroskin2023-03-271-4/+10
| | | | | | | | | | | | | This attribute, which is part of ethtool's ring param configuration allows the user to specify the maximum number of the packet's payload that can be written directly to the device. Example usage: # ethtool -G [interface] tx-push-buf-len [number of bytes] Co-developed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* net: ethtool: extend ringparam set/get APIs for rx_pushShannon Nelson2023-02-131-0/+4
| | | | | | | | Similar to what was done for TX_PUSH, add an RX_PUSH concept to the ethtool interfaces. Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: ethtool: provide shims for stats aggregation helpers when ↵Vladimir Oltean2023-01-261-11/+0
| | | | | | | | | | | | | | | | | | | | CONFIG_ETHTOOL_NETLINK=n ethtool_aggregate_*_stats() are implemented in net/ethtool/stats.c, a file which is compiled out when CONFIG_ETHTOOL_NETLINK=n. In order to avoid adding Kbuild dependencies from drivers (which call these helpers) on CONFIG_ETHTOOL_NETLINK, let's add some shim definitions which simply make the helpers dead code. This means the function prototypes should have been located in include/linux/ethtool_netlink.h rather than include/linux/ethtool.h. Fixes: 449c5459641a ("net: ethtool: add helpers for aggregate statistics") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20230125110214.4127759-1-vladimir.oltean@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
* net: ethtool: add helpers for MM fragment size translationVladimir Oltean2023-01-231-0/+42
| | | | | | | | | | | | | | | | We deliberately make the Linux UAPI pass the minimum fragment size in octets, even though IEEE 802.3 defines it as discrete values, and addFragSize is just the multiplier. This is because there is nothing impossible in operating with an in-between value for the fragment size of non-final preempted fragments, and there may even appear hardware which supports the in-between sizes. For the hardware which just understands the addFragSize multiplier, create two helpers which translate back and forth the values passed in octets. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: ethtool: add helpers for aggregate statisticsVladimir Oltean2023-01-231-40/+60
| | | | | | | | | | | | | | | | | | | | When a pMAC exists but the driver is unable to atomically query the aggregate eMAC+pMAC statistics, the user should be given back at least the sum of eMAC and pMAC counters queried separately. This is a generic problem, so add helpers in ethtool to do this operation, if the driver doesn't have a better way to report aggregate stats. Do this in a way that does not require changes to these functions when new stats are added (basically treat the structures as an array of u64 values, except for the first element which is the stats source). In include/linux/ethtool.h, there is already a section where helper function prototypes should be placed. The trouble is, this section is too early, before the definitions of struct ethtool_eth_mac_stats et.al. Move that section at the end and append these new helpers to it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: ethtool: netlink: retrieve stats from multiple sources (eMAC, pMAC)Vladimir Oltean2023-01-231-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | IEEE 802.3-2018 clause 99 defines a MAC Merge sublayer which contains an Express MAC and a Preemptible MAC. Both MACs are hidden to higher and lower layers and visible as a single MAC (packet classification to eMAC or pMAC on TX is done based on priority; classification on RX is done based on SFD). For devices which support a MAC Merge sublayer, it is desirable to retrieve individual packet counters from the eMAC and the pMAC, as well as aggregate statistics (their sum). Introduce a new ETHTOOL_A_STATS_SRC attribute which is part of the policy of ETHTOOL_MSG_STATS_GET and, and an ETHTOOL_A_PAUSE_STATS_SRC which is part of the policy of ETHTOOL_MSG_PAUSE_GET (accepted when ETHTOOL_FLAG_STATS is set in the common ethtool header). Both of these take values from enum ethtool_mac_stats_src, defaulting to "aggregate" in the absence of the attribute. Existing drivers do not need to pay attention to this enum which was added to all driver-facing structures, just the ones which report the MAC merge layer as supported. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: ethtool: add support for MAC Merge layerVladimir Oltean2023-01-231-0/+99
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The MAC merge sublayer (IEEE 802.3-2018 clause 99) is one of 2 specifications (the other being Frame Preemption; IEEE 802.1Q-2018 clause 6.7.2), which work together to minimize latency caused by frame interference at TX. The overall goal of TSN is for normal traffic and traffic with a bounded deadline to be able to cohabitate on the same L2 network and not bother each other too much. The standards achieve this (partly) by introducing the concept of preemptible traffic, i.e. Ethernet frames that have a custom value for the Start-of-Frame-Delimiter (SFD), and these frames can be fragmented and reassembled at L2 on a link-local basis. The non-preemptible frames are called express traffic, they are transmitted using a normal SFD, and they can preempt preemptible frames, therefore having lower latency, which can matter at lower (100 Mbps) link speeds, or at high MTUs (jumbo frames around 9K). Preemption is not recursive, i.e. a P frame cannot preempt another P frame. Preemption also does not depend upon priority, or otherwise said, an E frame with prio 0 will still preempt a P frame with prio 7. In terms of implementation, the standards talk about the presence of an express MAC (eMAC) which handles express traffic, and a preemptible MAC (pMAC) which handles preemptible traffic, and these MACs are multiplexed on the same MII by a MAC merge layer. To support frame preemption, the definition of the SFD was generalized to SMD (Start-of-mPacket-Delimiter), where an mPacket is essentially an Ethernet frame fragment, or a complete frame. Stations unaware of an SMD value different from the standard SFD will treat P frames as error frames. To prevent that from happening, a negotiation process is defined. On RX, packets are dispatched to the eMAC or pMAC after being filtered by their SMD. On TX, the eMAC/pMAC classification decision is taken by the 802.1Q spec, based on packet priority (each of the 8 user priority values may have an admin-status of preemptible or express). The MAC Merge layer and the Frame Preemption parameters have some degree of independence in terms of how software stacks are supposed to deal with them. The activation of the MM layer is supposed to be controlled by an LLDP daemon (after it has been communicated that the link partner also supports it), after which a (hardware-based or not) verification handshake takes place, before actually enabling the feature. So the process is intended to be relatively plug-and-play. Whereas FP settings are supposed to be coordinated across a network using something approximating NETCONF. The support contained here is exclusively for the 802.3 (MAC Merge) portions and not for the 802.1Q (Frame Preemption) parts. This API is sufficient for an LLDP daemon to do its job. The FP adminStatus variable from 802.1Q is outside the scope of an LLDP daemon. I have taken a few creative licenses and augmented the Linux kernel UAPI compared to the standard managed objects recommended by IEEE 802.3. These are: - ETHTOOL_A_MM_PMAC_ENABLED: According to Figure 99-6: Receive Processing state diagram, a MAC Merge layer is always supposed to be able to receive P frames. However, this implies keeping the pMAC powered on, which will consume needless power in applications where FP will never be used. If LLDP is used, the reception of an Additional Ethernet Capabilities TLV from the link partner is sufficient indication that the pMAC should be enabled. So my proposal is that in Linux, we keep the pMAC turned off by default and that user space turns it on when needed. - ETHTOOL_A_MM_VERIFY_ENABLED: The IEEE managed object is called aMACMergeVerifyDisableTx. I opted for consistency (positive logic) in the boolean netlink attributes offered, so this is also positive here. Other than the meaning being reversed, they correspond to the same thing. - ETHTOOL_A_MM_MAX_VERIFY_TIME: I found it most reasonable for a LLDP daemon to maximize the verifyTime variable (delay between SMD-V transmissions), to maximize its chances that the LP replies. IEEE says that the verifyTime can range between 1 and 128 ms, but the NXP ENETC stupidly keeps this variable in a 7 bit register, so the maximum supported value is 127 ms. I could have chosen to hardcode this in the LLDP daemon to a lower value, but why not let the kernel expose its supported range directly. - ETHTOOL_A_MM_TX_MIN_FRAG_SIZE: the standard managed object is called aMACMergeAddFragSize, and expresses the "additional" fragment size (on top of ETH_ZLEN), whereas this expresses the absolute value of the fragment size. - ETHTOOL_A_MM_RX_MIN_FRAG_SIZE: there doesn't appear to exist a managed object mandated by the standard, but user space clearly needs to know what is the minimum supported fragment size of our local receiver, since LLDP must advertise a value no lower than that. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethtool: add tx aggregation parametersDaniele Palmas2023-01-131-1/+11
| | | | | | | | | | | | | | | | | Add the following ethtool tx aggregation parameters: ETHTOOL_A_COALESCE_TX_AGGR_MAX_BYTES Maximum size in bytes of a tx aggregated block of frames. ETHTOOL_A_COALESCE_TX_AGGR_MAX_FRAMES Maximum number of frames that can be aggregated into a block. ETHTOOL_A_COALESCE_TX_AGGR_TIME_USECS Time in usecs after the first packet arrival in an aggregated block for the block to be sent. Signed-off-by: Daniele Palmas <dnlplm@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/ethtool: add netlink interface for the PLCA RSPiergiorgio Beruto2023-01-111-0/+12
| | | | | | | | | | | Add support for configuring the PLCA Reconciliation Sublayer on multi-drop PHYs that support IEEE802.3cg-2019 Clause 148 (e.g., 10BASE-T1S). This patch adds the appropriate netlink interface to ethtool. Signed-off-by: Piergiorgio Beruto <piergiorgio.beruto@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethtool: doc: clarify what drivers can implement in their get_drvinfo()Vincent Mailhol2022-11-171-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Many of the drivers which implement ethtool_ops::get_drvinfo() will prints the .driver, .version or .bus_info of struct ethtool_drvinfo. To have a glance of current state, do: $ git grep -W "get_drvinfo(struct" Printing in those three fields is useless because: - since [1], the driver version should be the kernel version (at least for upstream drivers). Arguably, out of tree drivers might still want to set a custom version, but out of tree is not our focus. - since [2], the core is able to provide default values for .driver and .bus_info. In summary, drivers may provide .fw_version and .erom_version, the rest is expected to be done by the core. In struct ethtool_ops doc from linux/ethtool: rephrase field get_drvinfo() doc to discourage developers from implementing this callback. In struct ethtool_drvinfo doc from uapi/linux/ethtool.h: remove the paragraph mentioning what drivers should do. Rationale: no need to repeat what is already written in struct ethtool_ops doc. But add a note that .fw_version and .erom_version are driver defined. Also update the dummy driver and simply remove the callback in order not to confuse the newcomers: most of the drivers will not need this callback function any more. [1] commit 6a7e25c7fb48 ("net/core: Replace driver version to be kernel version") Link: https://git.kernel.org/torvalds/linux/c/6a7e25c7fb48 [2] commit edaf5df22cb8 ("ethtool: ethtool_get_drvinfo: populate drvinfo fields even if callback exits") Link: https://git.kernel.org/netdev/net-next/c/edaf5df22cb8 Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Link: https://lore.kernel.org/r/20221116171828.4093-1-mailhol.vincent@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* ethtool: linkstate: add a statistic for PHY down eventsJakub Kicinski2022-11-081-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | The previous attempt to augment carrier_down (see Link) was not met with much enthusiasm so let's do the simple thing of exposing what some devices already maintain. Add a common ethtool statistic for link going down. Currently users have to maintain per-driver mapping to extract the right stat from the vendor-specific ethtool -S stats. carrier_down does not fit the bill because it counts a lot of software related false positives. Add the statistic to the extended link state API to steer vendors towards implementing all of it. Implement for bnxt and all Linux-controlled PHYs. mlx5 and (possibly) enic also have a counter for this but I leave the implementation to their maintainers. Link: https://lore.kernel.org/r/20220520004500.2250674-1-kuba@kernel.org Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20221104190125.684910-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
* net: ethtool: extend ringparam set/get APIs for tx_pushJie Wang2022-04-151-0/+4
| | | | | | | | | | Currently tx push is a standard driver feature which controls use of a fast path descriptor push. So this patch extends the ringparam APIs and data structures to support set/get tx push by ethtool -G/g. Signed-off-by: Jie Wang <wangjie125@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* ethtool: add support to set/get completion queue event sizeSubbaraya Sundeep2022-02-231-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | Add support to set completion queue event size via ethtool -G parameter and get it via ethtool -g parameter. ~ # ./ethtool -G eth0 cqe-size 512 ~ # ./ethtool -g eth0 Ring parameters for eth0: Pre-set maximums: RX: 1048576 RX Mini: n/a RX Jumbo: n/a TX: 1048576 Current hardware settings: RX: 256 RX Mini: n/a RX Jumbo: n/a TX: 4096 RX Buf Len: 2048 CQE Size: 128 Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* ethtool: add header/data split indicationJakub Kicinski2022-01-281-0/+2
| | | | | | | | | | | | | | | | | | | For applications running on a mix of platforms it's useful to have a clear indication whether host's NIC supports the geometry requirements of TCP zero-copy. TCP zero-copy Rx requires data to be neatly placed into memory pages. Most NICs can't do that. This patch is adding GET support only, since the NICs I work with either always have the feature enabled or enable it whenever MTU is set to jumbo. In other words I don't need SET. But adding set should be trivial. (The only note on SET is that we will likely want the setting to be "sticky" and use 0 / `unknown` to reset it back to driver default.) Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethtool: Fix link extended state for big endianMoshe Tal2022-01-201-1/+1
| | | | | | | | | | | | | | | The link extended sub-states are assigned as enum that is an integer size but read from a union as u8, this is working for small values on little endian systems but for big endian this always give 0. Fix the variable in the union to match the enum size. Fixes: ecc31c60240b ("ethtool: Add link extended state") Signed-off-by: Moshe Tal <moshet@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethtool: extend ringparam setting/getting API with rx_buf_lenHao Chen2021-11-221-2/+6
| | | | | | | | | | Add two new parameters kernel_ringparam and extack for .get_ringparam and .set_ringparam to extend more ring params through netlink. Signed-off-by: Hao Chen <chenhao288@hisilicon.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethtool: add support to set/get rx buf len via ethtoolHao Chen2021-11-221-0/+18
| | | | | | | | | Add support to set rx buf len via ethtool -G parameter and get rx buf len via ethtool -g parameter. Signed-off-by: Hao Chen <chenhao288@hisilicon.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethtool: Add transceiver module extended stateIdo Schimmel2021-10-061-0/+1
| | | | | | | | | | | | | | | Add an extended state and sub-state to describe link issues related to transceiver modules. The 'ETHTOOL_LINK_EXT_SUBSTATE_MODULE_CMIS_NOT_READY' extended sub-state tells user space that port is unable to gain a carrier because the CMIS Module State Machine did not reach the ModuleReady (Fully Operational) state. For example, if the module is stuck at ModuleLowPwr or ModuleFault state. In case of the latter, user space can read the fault reason from the module's EEPROM and potentially reset it. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* ethtool: Add ability to control transceiver modules' power modeIdo Schimmel2021-10-061-0/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a pair of new ethtool messages, 'ETHTOOL_MSG_MODULE_SET' and 'ETHTOOL_MSG_MODULE_GET', that can be used to control transceiver modules parameters and retrieve their status. The first parameter to control is the power mode of the module. It is only relevant for paged memory modules, as flat memory modules always operate in low power mode. When a paged memory module is in low power mode, its power consumption is reduced to the minimum, the management interface towards the host is available and the data path is deactivated. User space can choose to put modules that are not currently in use in low power mode and transition them to high power mode before putting the associated ports administratively up. This is useful for user space that favors reduced power consumption and lower temperatures over reduced link up times. In QSFP-DD modules the transition from low power mode to high power mode can take a few seconds and this transition is only expected to get longer with future / more complex modules. User space can control the power mode of the module via the power mode policy attribute ('ETHTOOL_A_MODULE_POWER_MODE_POLICY'). Possible values: * high: Module is always in high power mode. * auto: Module is transitioned by the host to high power mode when the first port using it is put administratively up and to low power mode when the last port using it is put administratively down. The operational power mode of the module is available to user space via the 'ETHTOOL_A_MODULE_POWER_MODE' attribute. The attribute is not reported to user space when a module is not plugged-in. The user API is designed to be generic enough so that it could be used for modules with different memory maps (e.g., SFF-8636, CMIS). The only implementation of the device driver API in this series is for a MAC driver (mlxsw) where the module is controlled by the device's firmware, but it is designed to be generic enough so that it could also be used by implementations where the module is controlled by the CPU. CMIS testing ============ # ethtool -m swp11 Identifier : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628)) ... Module State : 0x03 (ModuleReady) LowPwrAllowRequestHW : Off LowPwrRequestSW : Off The module is not in low power mode, as it is not forced by hardware (LowPwrAllowRequestHW is off) or by software (LowPwrRequestSW is off). The power mode can be queried from the kernel. In case LowPwrAllowRequestHW was on, the kernel would need to take into account the state of the LowPwrRequestHW signal, which is not visible to user space. $ ethtool --show-module swp11 Module parameters for swp11: power-mode-policy high power-mode high Change the power mode policy to 'auto': # ethtool --set-module swp11 power-mode-policy auto Query the power mode again: $ ethtool --show-module swp11 Module parameters for swp11: power-mode-policy auto power-mode low Verify with the data read from the EEPROM: # ethtool -m swp11 Identifier : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628)) ... Module State : 0x01 (ModuleLowPwr) LowPwrAllowRequestHW : Off LowPwrRequestSW : On Put the associated port administratively up which will instruct the host to transition the module to high power mode: # ip link set dev swp11 up Query the power mode again: $ ethtool --show-module swp11 Module parameters for swp11: power-mode-policy auto power-mode high Verify with the data read from the EEPROM: # ethtool -m swp11 Identifier : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628)) ... Module State : 0x03 (ModuleReady) LowPwrAllowRequestHW : Off LowPwrRequestSW : Off Put the associated port administratively down which will instruct the host to transition the module to low power mode: # ip link set dev swp11 down Query the power mode again: $ ethtool --show-module swp11 Module parameters for swp11: power-mode-policy auto power-mode low Verify with the data read from the EEPROM: # ethtool -m swp11 Identifier : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628)) ... Module State : 0x01 (ModuleLowPwr) LowPwrAllowRequestHW : Off LowPwrRequestSW : On SFF-8636 testing ================ # ethtool -m swp13 Identifier : 0x11 (QSFP28) ... Extended identifier description : 5.0W max. Power consumption, High Power Class (> 3.5 W) enabled Power set : Off Power override : On ... Transmit avg optical power (Channel 1) : 0.7733 mW / -1.12 dBm Transmit avg optical power (Channel 2) : 0.7649 mW / -1.16 dBm Transmit avg optical power (Channel 3) : 0.7790 mW / -1.08 dBm Transmit avg optical power (Channel 4) : 0.7837 mW / -1.06 dBm Rcvr signal avg optical power(Channel 1) : 0.9302 mW / -0.31 dBm Rcvr signal avg optical power(Channel 2) : 0.9079 mW / -0.42 dBm Rcvr signal avg optical power(Channel 3) : 0.8993 mW / -0.46 dBm Rcvr signal avg optical power(Channel 4) : 0.8778 mW / -0.57 dBm The module is not in low power mode, as it is not forced by hardware (Power override is on) or by software (Power set is off). The power mode can be queried from the kernel. In case Power override was off, the kernel would need to take into account the state of the LPMode signal, which is not visible to user space. $ ethtool --show-module swp13 Module parameters for swp13: power-mode-policy high power-mode high Change the power mode policy to 'auto': # ethtool --set-module swp13 power-mode-policy auto Query the power mode again: $ ethtool --show-module swp13 Module parameters for swp13: power-mode-policy auto power-mode low Verify with the data read from the EEPROM: # ethtool -m swp13 Identifier : 0x11 (QSFP28) Extended identifier description : 5.0W max. Power consumption, High Power Class (> 3.5 W) not enabled Power set : On Power override : On ... Transmit avg optical power (Channel 1) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 2) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 3) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 4) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 1) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 2) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 3) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 4) : 0.0000 mW / -inf dBm Put the associated port administratively up which will instruct the host to transition the module to high power mode: # ip link set dev swp13 up Query the power mode again: $ ethtool --show-module swp13 Module parameters for swp13: power-mode-policy auto power-mode high Verify with the data read from the EEPROM: # ethtool -m swp13 Identifier : 0x11 (QSFP28) ... Extended identifier description : 5.0W max. Power consumption, High Power Class (> 3.5 W) enabled Power set : Off Power override : On ... Transmit avg optical power (Channel 1) : 0.7934 mW / -1.01 dBm Transmit avg optical power (Channel 2) : 0.7859 mW / -1.05 dBm Transmit avg optical power (Channel 3) : 0.7885 mW / -1.03 dBm Transmit avg optical power (Channel 4) : 0.7985 mW / -0.98 dBm Rcvr signal avg optical power(Channel 1) : 0.9325 mW / -0.30 dBm Rcvr signal avg optical power(Channel 2) : 0.9034 mW / -0.44 dBm Rcvr signal avg optical power(Channel 3) : 0.9086 mW / -0.42 dBm Rcvr signal avg optical power(Channel 4) : 0.8885 mW / -0.51 dBm Put the associated port administratively down which will instruct the host to transition the module to low power mode: # ip link set dev swp13 down Query the power mode again: $ ethtool --show-module swp13 Module parameters for swp13: power-mode-policy auto power-mode low Verify with the data read from the EEPROM: # ethtool -m swp13 Identifier : 0x11 (QSFP28) ... Extended identifier description : 5.0W max. Power consumption, High Power Class (> 3.5 W) not enabled Power set : On Power override : On ... Transmit avg optical power (Channel 1) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 2) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 3) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 4) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 1) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 2) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 3) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 4) : 0.0000 mW / -inf dBm Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* ethtool: extend coalesce setting uAPI with CQE modeYufeng Mo2021-08-241-2/+9
| | | | | | | | | | | In order to support more coalesce parameters through netlink, add two new parameter kernel_coal and extack for .set_coalesce and .get_coalesce, then some extra info can return to user with the netlink API. Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* ethtool: add two coalesce attributes for CQE modeYufeng Mo2021-08-241-1/+10
| | | | | | | | | | | | | | | | Currently, there are many drivers who support CQE mode configuration, some configure it as a fixed when initialized, some provide an interface to change it by ethtool private flags. In order to make it more generic, add two new 'ETHTOOL_A_COALESCE_USE_CQE_TX' and 'ETHTOOL_A_COALESCE_USE_CQE_RX' coalesce attributes, then these parameters can be accessed by ethtool netlink coalesce uAPI. Also add an new structure kernel_ethtool_coalesce, then the new parameter can be added into this struct. Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* ethtool: improve compat ioctl handlingArnd Bergmann2021-07-231-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ethtool compat ioctl handling is hidden away in net/socket.c, which introduces a couple of minor oddities: - The implementation may end up diverging, as seen in the RXNFC extension in commit 84a1d9c48200 ("net: ethtool: extend RXNFC API to support RSS spreading of filter matches") that does not work in compat mode. - Most architectures do not need the compat handling at all because u64 and compat_u64 have the same alignment. - On x86, the conversion is done for both x32 and i386 user space, but it's actually wrong to do it for x32 and cannot work there. - On 32-bit Arm, it never worked for compat oabi user space, since that needs to do the same conversion but does not. - It would be nice to get rid of both compat_alloc_user_space() and copy_in_user() throughout the kernel. None of these actually seems to be a serious problem that real users are likely to encounter, but fixing all of them actually leads to code that is both shorter and more readable. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethtool: add a new command for getting PHC virtual clocksYangbo Lu2021-07-011-0/+10
| | | | | | | | | Add an interface for getting PHC (PTP Hardware Clock) virtual clocks, which are based on PHC physical clock providing hardware timestamp to network packets. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethtool: Use kernel data types for internal EEPROM structIdo Schimmel2021-06-221-6/+6
| | | | | | | | | | The struct is not visible to user space and therefore should not use the user visible data types. Instead, use internal data types like other structures in the file. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethtool: add interface to read RMON statsJakub Kicinski2021-04-161-0/+43
| | | | | | | | | | | | | | | | | | | | Most devices maintain RMON (RFC 2819) stats - particularly the "histogram" of packets received by size. Unlike other RFCs which duplicate IEEE stats, the short/oversized frame counters in RMON don't seem to match IEEE stats 1-to-1 either, so expose those, too. Do not expose basic packet, CRC errors etc - those are already otherwise covered. Because standard defines packet ranges only up to 1518, and everything above that should theoretically be "oversized" - devices often create their own ranges. Going beyond what the RFC defines - expose the "histogram" in the Tx direction (assume for now that the ranges will be the same). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethtool: add interface to read standard MAC Ctrl statsJakub Kicinski2021-04-161-0/+12
| | | | | | | | Number of devices maintains the standard-based MAC control counters for control frames. Add a API for those. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethtool: add interface to read standard MAC statsJakub Kicinski2021-04-161-0/+31
| | | | | | | | | | Most of the MAC statistics are included in struct rtnl_link_stats64, but some fields are aggregated. Besides it's good to expose these clearly hardware stats separately. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethtool: add a new command for reading standard statsJakub Kicinski2021-04-161-0/+10
| | | | | | | | | | | | | | | | | Add an interface for reading standard stats, including stats which don't have a corresponding control interface. Start with IEEE 802.3 PHY stats. There seems to be only one stat to expose there. Define API to not require user space changes when new stats or groups are added. Groups are based on bitset, stats have a string set associated. v1: wrap stats in a nest Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethtool: add FEC statisticsJakub Kicinski2021-04-151-0/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Similarly to pause statistics add stats for FEC. The IEEE standard mandates two sets of counters: - 30.5.1.1.17 aFECCorrectedBlocks - 30.5.1.1.18 aFECUncorrectableBlocks where block is a block of bits FEC operates on. Each of these counters is defined per lane (PCS instance). Multiple vendors provide number of corrected _bits_ rather than/as well as blocks. This set adds the 2 standard-based block counters and a extra one for corrected bits. Counters are exposed to user space via netlink in new attributes. Each attribute carries an array of u64s, first element is the total count, and the following ones are a per-lane break down. Much like with pause stats the operation will not fail when driver does not implement the get_fec_stats callback (nor can the driver fail the operation by returning an error). If stats can't be reported the relevant attributes will be empty. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethtool: move ethtool_stats_initJakub Kicinski2021-04-151-0/+6
| | | | | | | We'll need it for FEC stats as well. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethtool: Allow network drivers to dump arbitrary EEPROM dataVladyslav Tarasiuk2021-04-111-1/+32
| | | | | | | | | | | | | | Define get_module_eeprom_by_page() ethtool callback and implement netlink infrastructure. get_module_eeprom_by_page() allows network drivers to dump a part of module's EEPROM specified by page and bank numbers along with offset and length. It is effectively a netlink replacement for get_module_info() and get_module_eeprom() pair, which is needed due to emergence of complex non-linear EEPROM layouts. Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2021-04-091-6/+17
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: MAINTAINERS - keep Chandrasekar drivers/net/ethernet/mellanox/mlx5/core/en_main.c - simple fix + trust the code re-added to param.c in -next is fine include/linux/bpf.h - trivial include/linux/ethtool.h - trivial, fix kdoc while at it include/linux/skmsg.h - move to relevant place in tcp.c, comment re-wrapped net/core/skmsg.c - add the sk = sk // sk = NULL around calls net/tipc/crypto.c - trivial Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * ethtool: Remove link_mode param and derive link params from driverDanielle Ratson2021-04-071-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some drivers clear the 'ethtool_link_ksettings' struct in their get_link_ksettings() callback, before populating it with actual values. Such drivers will set the new 'link_mode' field to zero, resulting in user space receiving wrong link mode information given that zero is a valid value for the field. Another problem is that some drivers (notably tun) can report random values in the 'link_mode' field. This can result in a general protection fault when the field is used as an index to the 'link_mode_params' array [1]. This happens because such drivers implement their set_link_ksettings() callback by simply overwriting their private copy of 'ethtool_link_ksettings' struct with the one they get from the stack, which is not always properly initialized. Fix these problems by removing 'link_mode' from 'ethtool_link_ksettings' and instead have drivers call ethtool_params_from_link_mode() with the current link mode. The function will derive the link parameters (e.g., speed) from the link mode and fill them in the 'ethtool_link_ksettings' struct. v3: * Remove link_mode parameter and derive the link parameters in the driver instead of passing link_mode parameter to ethtool and derive it there. v2: * Introduce 'cap_link_mode_supported' instead of adding a validity field to 'ethtool_link_ksettings' struct. [1] general protection fault, probably for non-canonical address 0xdffffc00f14cc32c: 0000 [#1] PREEMPT SMP KASAN KASAN: probably user-memory-access in range [0x000000078a661960-0x000000078a661967] CPU: 0 PID: 8452 Comm: syz-executor360 Not tainted 5.11.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__ethtool_get_link_ksettings+0x1a3/0x3a0 net/ethtool/ioctl.c:446 Code: b7 3e fa 83 fd ff 0f 84 30 01 00 00 e8 16 b0 3e fa 48 8d 3c ed 60 d5 69 8a 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03 +38 d0 7c 08 84 d2 0f 85 b9 RSP: 0018:ffffc900019df7a0 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: ffff888026136008 RCX: 0000000000000000 RDX: 00000000f14cc32c RSI: ffffffff873439ca RDI: 000000078a661960 RBP: 00000000ffff8880 R08: 00000000ffffffff R09: ffff88802613606f R10: ffffffff873439bc R11: 0000000000000000 R12: 0000000000000000 R13: ffff88802613606c R14: ffff888011d0c210 R15: ffff888011d0c210 FS: 0000000000749300(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000004b60f0 CR3: 00000000185c2000 CR4: 00000000001506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: linkinfo_prepare_data+0xfd/0x280 net/ethtool/linkinfo.c:37 ethnl_default_notify+0x1dc/0x630 net/ethtool/netlink.c:586 ethtool_notify+0xbd/0x1f0 net/ethtool/netlink.c:656 ethtool_set_link_ksettings+0x277/0x330 net/ethtool/ioctl.c:620 dev_ethtool+0x2b35/0x45d0 net/ethtool/ioctl.c:2842 dev_ioctl+0x463/0xb70 net/core/dev_ioctl.c:440 sock_do_ioctl+0x148/0x2d0 net/socket.c:1060 sock_ioctl+0x477/0x6a0 net/socket.c:1177 vfs_ioctl fs/ioctl.c:48 [inline] __do_sys_ioctl fs/ioctl.c:753 [inline] __se_sys_ioctl fs/ioctl.c:739 [inline] __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: c8907043c6ac9 ("ethtool: Get link mode in use instead of speed and duplex parameters") Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>