summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* dt-bindings: net: dsa: microchip: Update ksz device tree bindings for drive ↵Oleksij Rempel2023-09-161-0/+20
| | | | | | | | | | | | | | | | | | | strength Extend device tree bindings to support drive strength configuration for the ksz* switches. Introduced properties: - microchip,hi-drive-strength-microamp: Controls the drive strength for high-speed interfaces like GMII/RGMII and more. - microchip,lo-drive-strength-microamp: Governs the drive strength for low-speed interfaces such as LEDs, PME_N, and others. - microchip,io-drive-strength-microamp: Controls the drive strength for for undocumented Pads on KSZ88xx variants. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch '200GbE' of ↵David S. Miller2023-09-1626-0/+19191
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Introduce Intel IDPF driver Pavan Kumar Linga says: This patch series introduces the Intel Infrastructure Data Path Function (IDPF) driver. It is used for both physical and virtual functions. Except for some of the device operations the rest of the functionality is the same for both PF and VF. IDPF uses virtchnl version2 opcodes and structures defined in the virtchnl2 header file which helps the driver to learn the capabilities and register offsets from the device Control Plane (CP) instead of assuming the default values. The format of the series follows the driver init flow to interface open. To start with, probe gets called and kicks off the driver initialization by spawning the 'vc_event_task' work queue which in turn calls the 'hard reset' function. As part of that, the mailbox is initialized which is used to send/receive the virtchnl messages to/from the CP. Once that is done, 'core init' kicks in which requests all the required global resources from the CP and spawns the 'init_task' work queue to create the vports. Based on the capability information received, the driver creates the said number of vports (one or many) where each vport is associated to a netdev. Also, each vport has its own resources such as queues, vectors etc. From there, rest of the netdev_ops and data path are added. IDPF implements both single queue which is traditional queueing model as well as split queue model. In split queue model, it uses separate queue for both completion descriptors and buffers which helps to implement out-of-order completions. It also helps to implement asymmetric queues, for example multiple RX completion queues can be processed by a single RX buffer queue and multiple TX buffer queues can be processed by a single TX completion queue. In single queue model, same queue is used for both descriptor completions as well as buffer completions. It also supports features such as generic checksum offload, generic receive offload (hardware GRO) etc. --- v7: Patch 2: * removed pci_[disable|enable]_pcie_error_reporting as they are dropped from the core Patch 4, 9: * used 'kasprintf' instead of 'snprintf' to avoid providing explicit character string size which also fixes "-Wformat-truncation" warnings Patch 14: * used 'ethtool_sprintf' instead of 'snprintf' to avoid providing explicit character string size which also fixes "-Wformat-truncation" warning * add string format argument to the 'ethtool_sprintf' to avoid warning on "-Wformat-security" v6: https://lore.kernel.org/netdev/20230825235954.894050-1-pavan.kumar.linga@intel.com/ Note: 'Acked-by' was only added to patches 1, 2, 12 and not to the other patches because of the changes in v6 Patch 3, 4, 5, 6, 7, 8, 9, 11, 13, 14, 15: * renamed 'reset_lock' to 'vport_ctrl_lock' to reflect the lock usage * to avoid defensive programming, used 'vport_ctrl_lock' for the user callbacks that access the 'vport' to prevent the hardware reset thread from releasing the 'vport', when the user callback is in progress * added some variables to netdev private structure to avoid vport access if possible from ethtool and ndo callbacks * moved 'mac_filter_list_lock' and MAC related flags to vport_config structure and refactored mac filter flow to handle asynchronous ndo mac filter callbacks * stop the queues before starting the reset flow to avoid TX hangs * removed 'sw_mutex' and 'stop_mutex' as they are not needed anymore * added missing clear bit in 'init_task' error path * renamed labels appropriately Patch 8: * replaced page_pool_put_page with page_pool_put_full_page * for the page pool max_len, used PAGE_SIZE Patch 10, 11, 13: * made use of the 'netif_txq_maybe_stop', '__netif_txq_completed_wake' helper macros Patch 13: * removed IDPF_HR_RESET_IN_PROG flag check in idpf_tx_singleq_start as it is defensive Patch 14: * removed max descriptor check as the core does that * removed unnecessary error messages * removed the stats that are common between the ones reported by ethtool and ip link * replaced snprintf with ethtool_sprintf * added a comment to explain the reason for the max queue check * as the netdev queues are set on alloc, there is no need to set them again on reset unless there is a queue change, so move the 'idpf_set_real_num_queues' to 'idpf_initiate_soft_reset' Patch 15: * reworded the 'configure SRIOV' in the commit message v5: https://lore.kernel.org/netdev/20230816004305.216136-1-anthony.l.nguyen@intel.com/ Most Patches: * wrapped line limit to 80 chars to those which don't effect readability Patch 12: * in skb_add_rx_frag, offset 'headlen' w.r.t page_offset when adding a frag to avoid adding the header again Patch 14: * added NULL check for 'rxq' when dereferencing it in page_pool_get_stats v4: https://lore.kernel.org/netdev/20230808003416.3805142-1-anthony.l.nguyen@intel.com/ Patch 1: * s/virtcnl/virtchnl * removed the kernel doc for the error code definitions that don't exist * reworded the summary part in the virtchnl2 header Patch 3: * don't set local variable to NULL on error * renamed sq_send_command_out label with err_unlock * don't use __GFP_ZERO in dma_alloc_coherent Patch 4: * introduced mailbox workqueue to process mailbox interrupts Patch 3, 4, 5, 6, 7, 8, 9, 11, 15: * removed unnecessary variable 0-init Patch 3, 5, 7, 8, 9, 15: * removed defensive programming checks wherever applicable * removed IDPF_CAP_FIELD_LAST as it can be treated as defensive programming Patch 3, 4, 5, 6, 7: * replaced IDPF_DFLT_MBX_BUF_SIZE with IDPF_CTLQ_MAX_BUF_LEN Patch 2 to 15: * add kernel-doc for idpf.h and idpf_txrx.h enums and structures Patch 4, 5, 15: * adjusted the destroy sequence of the workqueues as per the alloc sequence Patch 4, 5, 9, 15: * scrub unnecessary flags in 'idpf_flags' - IDPF_REMOVE_IN_PROG flag can take care of the cases where IDPF_REL_RES_IN_PROG is used, removed the later one - IDPF_REQ_[TX|RX]_SPLITQ are replaced with struct variables - IDPF_CANCEL_[SERVICE|STATS]_TASK are redundant as the work queue doesn't get rescheduled again after 'cancel_delayed_work_sync' - IDPF_HR_CORE_RESET is removed as there is no set_bit for this flag - IDPF_MB_INTR_TRIGGER is removed as it is not needed anymore with the mailbox workqueue implementation Patch 7 to 15: * replaced the custom buffer recycling code with page pool API * switched the header split buffer allocations from using a bunch of pages to using one large chunk of DMA memory * reordered some of the flows in vport_open to support page pool Patch 8, 12: * don't suppress the alloc errors by using __GFP_NOWARN Patch 9: * removed dyn_ctl_clrpba_m as it is not being used Patch 14: * introduced enum idpf_vport_reset_cause instead of using vport flags * introduced page pool stats v3: https://lore.kernel.org/netdev/20230616231341.2885622-1-anthony.l.nguyen@intel.com/ Patch 5: * instead of void, used 'struct virtchnl2_create_vport' type for vport_params_recvd and vport_params_reqd and removed the typecasting * used u16/u32 as needed instead of int for variables which cannot be negative and updated in all the places whereever applicable Patch 6: * changed the commit message to "add ptypes and MAC filter support" * used the sender Signed-off-by as the last tag on all the patches * removed unnecessary variables 0-init * instead of fixing the code in this commit, fixed it in the commit where the change was introduced first * moved get_type_info struct on to the stack instead of memory alloc * moved mutex_lock and ptype_info memory alloc outside while loop and adjusted the return flow * used 'break' instead of 'continue' in ptype id switch case v2: https://lore.kernel.org/netdev/20230614171428.1504179-1-anthony.l.nguyen@intel.com/ Patch 2: * added "Intel(R)" to the DRV_SUMMARY and Makefile. Patch 4, 5, 6, 15: * replaced IDPF_VC_MSG_PENDING flag with mutex 'vc_buf_lock' for the adapter related virtchnl opcodes. * get the mutex lock in the virtchnl send thread itself instead of in receive thread. Patch 5, 6, 7, 8, 9, 11, 14, 15: * replaced IDPF_VPORT_VC_MSG_PENDING flag with mutex 'vc_buf_lock' for the vport related virtchnl opcodes. * get the mutex lock in the virtchnl send thread itself instead of in receive thread. Patch 6: * converted get_ptype_info logic from 1:N to 1:1 message exchange for better handling of mutex lock. Patch 15: * introduced 'stats_lock' spinlock to avoid concurrent stats update. v1: https://lore.kernel.org/netdev/20230530234501.2680230-1-anthony.l.nguyen@intel.com/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * idpf: add SRIOV support and other ndo_opsJoshua Hay2023-09-1310-3/+1114
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for SRIOV: send the requested number of VFs to the device Control Plane, via the virtchnl message and then enable the VFs using 'pci_enable_sriov'. Add other ndo ops supported by the driver such as features_check, set_rx_mode, validate_addr, set_mac_address, change_mtu, get_stats64, set_features, and tx_timeout. Initialize the statistics task which requests the queue related statistics to the CP. Add loopback and promiscuous mode support and the respective virtchnl messages. Finally, add documentation and build support for the driver. Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Co-developed-by: Alan Brady <alan.brady@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Co-developed-by: Madhu Chittim <madhu.chittim@intel.com> Signed-off-by: Madhu Chittim <madhu.chittim@intel.com> Co-developed-by: Phani Burra <phani.r.burra@intel.com> Signed-off-by: Phani Burra <phani.r.burra@intel.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Co-developed-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com> Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * idpf: add ethtool callbacksAlan Brady2023-09-137-3/+1851
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Initialize all the ethtool ops that are supported by the driver and add the necessary support for the ethtool callbacks. Also add asynchronous link notification virtchnl support where the device Control Plane sends the link status and link speed as an asynchronous event message. Driver report the link speed on ethtool .idpf_get_link_ksettings query. Introduce soft reset function which is used by some of the ethtool callbacks such as .set_channels, .set_ringparam etc. to change the existing queue configuration. It deletes the existing queues by sending delete queues virtchnl message to the CP and calls the 'vport_stop' flow which disables the queues, vport etc. New set of queues are requested to the CP and reconfigure the queue context by calling the 'vport_open' flow. Soft reset flow also adjusts the number of vectors associated to a vport if .set_channels is called. Signed-off-by: Alan Brady <alan.brady@intel.com> Co-developed-by: Alice Michael <alice.michael@intel.com> Signed-off-by: Alice Michael <alice.michael@intel.com> Co-developed-by: Joshua Hay <joshua.a.hay@intel.com> Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Co-developed-by: Phani Burra <phani.r.burra@intel.com> Signed-off-by: Phani Burra <phani.r.burra@intel.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Co-developed-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com> Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * idpf: add singleq start_xmit and napi pollJoshua Hay2023-09-138-31/+1290
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the start_xmit, TX and RX napi poll support for the single queue model. Unlike split queue model, single queue uses same queue to post buffer descriptors and completed descriptors. Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Co-developed-by: Alan Brady <alan.brady@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Co-developed-by: Madhu Chittim <madhu.chittim@intel.com> Signed-off-by: Madhu Chittim <madhu.chittim@intel.com> Co-developed-by: Phani Burra <phani.r.burra@intel.com> Signed-off-by: Phani Burra <phani.r.burra@intel.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Co-developed-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com> Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * idpf: add RX splitq napi poll supportAlan Brady2023-09-134-6/+892
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support to handle interrupts for the RX completion queue and RX buffer queue. When the interrupt fires on RX completion queue, process the RX descriptors that are received. Allocate and prepare the SKB with the RX packet info, for both data and header buffer. IDPF uses software maintained refill queues to manage buffers between RX queue producer and the buffer queue consumer. They are required in order to maintain a lockless buffer management system and are strictly software only constructs. Instead of updating the RX buffer queue tail with available buffers right after the clean routine, it posts the buffer ids to the refill queues, only to post them to the HW later. If the generic receive offload (GRO) is enabled in the capabilities and turned on by default or via ethtool, then HW performs the packet coalescing if certain criteria are met by the incoming packets and updates the RX descriptor. Similar to GRO, if generic checksum is enabled, HW computes the checksum and updates the respective fields in the descriptor. Add support to update the SKB fields with the GRO and the generic checksum received. Signed-off-by: Alan Brady <alan.brady@intel.com> Co-developed-by: Joshua Hay <joshua.a.hay@intel.com> Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Co-developed-by: Madhu Chittim <madhu.chittim@intel.com> Signed-off-by: Madhu Chittim <madhu.chittim@intel.com> Co-developed-by: Phani Burra <phani.r.burra@intel.com> Signed-off-by: Phani Burra <phani.r.burra@intel.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Co-developed-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com> Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * idpf: add TX splitq napi poll supportJoshua Hay2023-09-136-5/+926
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support to handle the interrupts for the TX completion queue and process the various completion types. In the flow scheduling mode, the driver processes primarily buffer completions as well as descriptor completions occasionally. This mode supports out of order TX completions. To do so, HW generates one buffer completion per packet. Each of those completions contains the unique tag provided during the TX encoding which is used to locate the packet either on the TX buffer ring or in a hash table. The hash table is used to track TX buffer information so the descriptor(s) for a given packet can be reused while the driver is still waiting on the buffer completion(s). Packets end up in the hash table in one of 2 ways: 1) a packet was stashed during descriptor completion cleaning, or 2) because an out of order buffer completion was processed. A descriptor completion arrives only every so often and is primarily used to guarantee the TX descriptor ring can be reused without having to wait on the individual buffer completions. E.g. a descriptor completion for N+16 guarantees HW read all of the descriptors for packets N through N+15, therefore all of the buffers for packets N through N+15 are stashed into the hash table and the descriptors can be reused for more TX packets. Similarly, a packet can be stashed in the hash table because an out an order buffer completion was processed. E.g. processing a buffer completion for packet N+3 implies that HW read all of the descriptors for packets N through N+3 and they can be reused. However, the HW did not do the DMA yet. The buffers for packets N through N+2 cannot be freed, so they are stashed in the hash table. In either case, the buffer completions will eventually be processed for all of the stashed packets, and all of the buffers will be cleaned from the hash table. In queue based scheduling mode, the driver processes primarily descriptor completions and cleans the TX ring the conventional way. Finally, the driver triggers a TX queue drain after sending the disable queues virtchnl message. When the HW completes the queue draining, it sends the driver a queue marker packet completion. The driver determines when all TX queues have been drained and proceeds with the disable flow. With this, the driver can send TX packets and clean up the resources properly. Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Co-developed-by: Alan Brady <alan.brady@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Co-developed-by: Madhu Chittim <madhu.chittim@intel.com> Signed-off-by: Madhu Chittim <madhu.chittim@intel.com> Co-developed-by: Phani Burra <phani.r.burra@intel.com> Signed-off-by: Phani Burra <phani.r.burra@intel.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Co-developed-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com> Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * idpf: add splitq start_xmitJoshua Hay2023-09-135-1/+1107
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add start_xmit support for split queue model. To start with, add the necessary checks to linearize the skb if it uses more number of buffers than the hardware supported limit. Stop the transmit queue if there are no enough descriptors available for the skb to use or if there we're going to potentially overrun the completion queue. Finally prepare the descriptor with all the required information and update the tail. Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Co-developed-by: Alan Brady <alan.brady@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Co-developed-by: Madhu Chittim <madhu.chittim@intel.com> Signed-off-by: Madhu Chittim <madhu.chittim@intel.com> Co-developed-by: Phani Burra <phani.r.burra@intel.com> Signed-off-by: Phani Burra <phani.r.burra@intel.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Co-developed-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com> Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * idpf: initialize interrupts and enable vportPavan Kumar Linga2023-09-1310-3/+1389
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To further continue 'vport open', initialize all the resources required for the interrupts. To start with, initialize the queue vector indices with the ones received from the device Control Plane. Now that all the TX and RX queues are initialized, map the RX descriptor and buffer queues as well as TX completion queues to the allocated vectors. Initialize and enable the napi handler for the napi polling. Finally, request the IRQs for the interrupt vectors from the stack and setup the interrupt handler. Once the interrupt init is done, send 'map queue vector', 'enable queues' and 'enable vport' virtchnl messages to the CP to complete the 'vport open' flow. Co-developed-by: Alan Brady <alan.brady@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Co-developed-by: Joshua Hay <joshua.a.hay@intel.com> Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Co-developed-by: Madhu Chittim <madhu.chittim@intel.com> Signed-off-by: Madhu Chittim <madhu.chittim@intel.com> Co-developed-by: Phani Burra <phani.r.burra@intel.com> Signed-off-by: Phani Burra <phani.r.burra@intel.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * idpf: configure resources for RX queuesAlan Brady2023-09-138-12/+1705
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Similar to the TX, RX also supports both single and split queue models. In single queue model, the same descriptor queue is used by SW to post buffer descriptors to HW and by HW to post completed descriptors to SW. In split queue model, "RX buffer queues" are used to pass descriptor buffers from SW to HW whereas "RX queues" are used to post the descriptor completions i.e. descriptors that point to completed buffers, from HW to SW. "RX queue group" is a set of RX queues grouped together and will be serviced by a "RX buffer queue group". IDPF supports 2 buffer queues i.e. large buffer (4KB) queue and small buffer (2KB) queue per buffer queue group. HW uses large buffers for 'hardware gro' feature and also if the packet size is more than 2KB, if not 2KB buffers are used. Add all the resources required for the RX queues initialization. Allocate memory for the RX queue and RX buffer queue groups. Initialize the software maintained refill queues for buffer management algorithm. Same like the TX queues, initialize the queue parameters for the RX queues and send the config RX queue virtchnl message to the device Control Plane. Signed-off-by: Alan Brady <alan.brady@intel.com> Co-developed-by: Alice Michael <alice.michael@intel.com> Signed-off-by: Alice Michael <alice.michael@intel.com> Co-developed-by: Joshua Hay <joshua.a.hay@intel.com> Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Co-developed-by: Madhu Chittim <madhu.chittim@intel.com> Signed-off-by: Madhu Chittim <madhu.chittim@intel.com> Co-developed-by: Phani Burra <phani.r.burra@intel.com> Signed-off-by: Phani Burra <phani.r.burra@intel.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Co-developed-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com> Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * idpf: configure resources for TX queuesAlan Brady2023-09-136-0/+1410
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | IDPF supports two queue models i.e. single queue which is a traditional queueing model as well as split queue model. In single queue model, the same descriptor queue is used by SW to post descriptors to the HW, HW to post completed descriptors to SW. In split queue model, "TX Queues" are used to pass buffers from SW to HW and "TX Completion Queues" are used to post descriptor completions from HW to SW. Device supports asymmetric ratio of TX queues to TX completion queues. Considering this, queue group mechanism is used i.e. some TX queues are grouped together which will be serviced by only one TX completion queue per TX queue group. Add all the resources required for the TX queues initialization. To start with, allocate memory for the TX queue groups, TX queues and TX completion queues. Then, allocate the descriptors for both TX and TX completion queues, and bookkeeping buffers for TX queues alone. Also, allocate queue vectors for the vport and initialize the TX queue related fields for each queue vector. Initialize the queue parameters such as q_id, q_type and tail register offset with the info received from the device control plane (CP). Once all the TX queues are configured, send config TX queue virtchnl message to the CP with all the TX queue context information. Signed-off-by: Alan Brady <alan.brady@intel.com> Co-developed-by: Alice Michael <alice.michael@intel.com> Signed-off-by: Alice Michael <alice.michael@intel.com> Co-developed-by: Joshua Hay <joshua.a.hay@intel.com> Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Co-developed-by: Phani Burra <phani.r.burra@intel.com> Signed-off-by: Phani Burra <phani.r.burra@intel.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Co-developed-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com> Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * idpf: add ptypes and MAC filter supportPavan Kumar Linga2023-09-134-1/+837
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the virtchnl support to request the packet types. Parse the responses received from CP and based on the protocol headers, populate the packet type structure with necessary information. Initialize the MAC address and add the virtchnl support to add and del MAC address. Co-developed-by: Alan Brady <alan.brady@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Co-developed-by: Joshua Hay <joshua.a.hay@intel.com> Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Co-developed-by: Madhu Chittim <madhu.chittim@intel.com> Signed-off-by: Madhu Chittim <madhu.chittim@intel.com> Co-developed-by: Phani Burra <phani.r.burra@intel.com> Signed-off-by: Phani Burra <phani.r.burra@intel.com> Co-developed-by: Shailendra Bhatnagar <shailendra.bhatnagar@intel.com> Signed-off-by: Shailendra Bhatnagar <shailendra.bhatnagar@intel.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * idpf: add create vport and netdev configurationPavan Kumar Linga2023-09-137-20/+1522
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the required support to create a vport by spawning the init task. Once the vport is created, initialize and allocate the resources needed for it. Configure and register a netdev for each vport with all the features supported by the device based on the capabilities received from the device Control Plane. Spawn the init task till all the default vports are created. Co-developed-by: Alan Brady <alan.brady@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Co-developed-by: Joshua Hay <joshua.a.hay@intel.com> Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Co-developed-by: Madhu Chittim <madhu.chittim@intel.com> Signed-off-by: Madhu Chittim <madhu.chittim@intel.com> Co-developed-by: Phani Burra <phani.r.burra@intel.com> Signed-off-by: Phani Burra <phani.r.burra@intel.com> Co-developed-by: Shailendra Bhatnagar <shailendra.bhatnagar@intel.com> Signed-off-by: Shailendra Bhatnagar <shailendra.bhatnagar@intel.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * idpf: add core init and interrupt requestPavan Kumar Linga2023-09-139-3/+1471
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As the mailbox is setup, add the necessary send and receive mailbox message framework to support the virtchnl communication between the driver and device Control Plane (CP). Add the core initialization. To start with, driver confirms the virtchnl version with the CP. Once that is done, it requests and gets the required capabilities and resources needed such as max vectors, queues etc. Based on the vector information received in 'VIRTCHNL2_OP_GET_CAPS', request the stack to allocate the required vectors. Finally add the interrupt handling mechanism for the mailbox queue and enable the interrupt. Note: Checkpatch issues a warning about IDPF_FOREACH_VPORT_VC_STATE and IDPF_GEN_STRING being complex macros and should be enclosed in parentheses but it's not the case. They are never used as a statement and instead only used to define the enum and array. Co-developed-by: Alan Brady <alan.brady@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Co-developed-by: Emil Tantilov <emil.s.tantilov@intel.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Co-developed-by: Joshua Hay <joshua.a.hay@intel.com> Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Co-developed-by: Madhu Chittim <madhu.chittim@intel.com> Signed-off-by: Madhu Chittim <madhu.chittim@intel.com> Co-developed-by: Phani Burra <phani.r.burra@intel.com> Signed-off-by: Phani Burra <phani.r.burra@intel.com> Co-developed-by: Shailendra Bhatnagar <shailendra.bhatnagar@intel.com> Signed-off-by: Shailendra Bhatnagar <shailendra.bhatnagar@intel.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * idpf: add controlq init and reset checksJoshua Hay2023-09-1314-3/+1851
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At the end of the probe, initialize and schedule the event workqueue. It calls the hard reset function where reset checks are done to find if the device is out of the reset. Control queue initialization and the necessary control queue support is added. Introduce function pointers for the register operations which are different between PF and VF devices. Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Co-developed-by: Alan Brady <alan.brady@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Co-developed-by: Madhu Chittim <madhu.chittim@intel.com> Signed-off-by: Madhu Chittim <madhu.chittim@intel.com> Co-developed-by: Phani Burra <phani.r.burra@intel.com> Signed-off-by: Phani Burra <phani.r.burra@intel.com> Co-developed-by: Shailendra Bhatnagar <shailendra.bhatnagar@intel.com> Signed-off-by: Shailendra Bhatnagar <shailendra.bhatnagar@intel.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Co-developed-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com> Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * idpf: add module register and probe functionalityPhani Burra2023-09-135-0/+193
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the required support to register IDPF PCI driver, as well as probe and remove call backs. Enable the PCI device and request the kernel to reserve the memory resources that will be used by the driver. Finally map the BAR0 address space. Signed-off-by: Phani Burra <phani.r.burra@intel.com> Co-developed-by: Alan Brady <alan.brady@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Co-developed-by: Madhu Chittim <madhu.chittim@intel.com> Signed-off-by: Madhu Chittim <madhu.chittim@intel.com> Co-developed-by: Shailendra Bhatnagar <shailendra.bhatnagar@intel.com> Signed-off-by: Shailendra Bhatnagar <shailendra.bhatnagar@intel.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Co-developed-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com> Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * virtchnl: add virtchnl version 2 opsPavan Kumar Linga2023-09-132-0/+1724
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Virtchnl version 1 is an interface used by the current generation of foundational NICs to negotiate the capabilities and configure the HW resources such as queues, vectors, RSS LUT, etc between the PF and VF drivers. It is not extensible to enable new features supported in the next generation of NICs/IPUs and to negotiate descriptor types, packet types and register offsets. To overcome the limitations of the existing interface, introduce the virtchnl version 2 and add the necessary opcodes, structures, definitions, and descriptor formats. The driver also learns the data queue and other register offsets to use instead of hardcoding them. The advantage of this approach is that it gives the flexibility to modify the register offsets if needed, restrict the use of certain descriptor types and negotiate the supported packet types. Co-developed-by: Alan Brady <alan.brady@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Co-developed-by: Joshua Hay <joshua.a.hay@intel.com> Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Co-developed-by: Madhu Chittim <madhu.chittim@intel.com> Signed-off-by: Madhu Chittim <madhu.chittim@intel.com> Co-developed-by: Phani Burra <phani.r.burra@intel.com> Signed-off-by: Phani Burra <phani.r.burra@intel.com> Co-developed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
* | Merge branch 'loongson1-mac'David S. Miller2023-09-167-0/+452
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Keguang Zhang says: ==================== Move Loongson1 MAC arch-code to the driver dir In order to convert Loongson1 MAC platform devices to the devicetree nodes, Loongson1 MAC arch-code should be moved to the driver dir. Add dt-binding document and update MAINTAINERS file accordingly. In other words, this patchset is a preparation for converting Loongson1 platform devices to devicetree. Changelog V4 -> V5: Replace stmmac_probe_config_dt() with devm_stmmac_probe_config_dt() Replace stmmac_pltfr_probe() with devm_stmmac_pltfr_probe() Squash patch 4 into patch 2 and 3 V3 -> V4: Add Acked-by tag from Krzysztof Kozlowski Add "|" to description part Amend "phy-mode" property Drop ls1x_dwmac_syscon definition and its instances Drop three redundant fields from the ls1x_dwmac structure Drop the ls1x_dwmac_init() method. Update the dt-binding document entry of Loongson1 Ethernet Some minor improvements V2 -> V3: Split the DT-schema file into loongson,ls1b-gmac.yaml and loongson,ls1c-emac.yaml (suggested by Serge Semin) Change the compatibles to loongson,ls1b-gmac and loongson,ls1c-emac Rename loongson,dwmac-syscon to loongson,ls1-syscon Amend the title Add description Add Reviewed-by tag from Krzysztof Kozlowski Change compatibles back to loongson,ls1b-syscon and loongson,ls1c-syscon Determine the device ID by physical base address(suggested by Serge Semin) Use regmap instead of regmap fields Use syscon_regmap_lookup_by_phandle() Some minor fixes Update the entries of MAINTAINERS V1 -> V2: Leave the Ethernet platform data for now Make the syscon compatibles more specific Fix "clock-names" and "interrupt-names" property Rename the syscon property to "loongson,dwmac-syscon" Drop "phy-handle" and "phy-mode" requirement Revert adding loongson,ls1b-dwmac/loongson,ls1c-dwmac to snps,dwmac.yaml Fix the build errors due to CONFIG_OF being unset Change struct reg_field definitions to const Rename the syscon property to "loongson,dwmac-syscon" Add MII PHY mode for LS1C Improve the commit message ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: stmmac: Add glue layer for Loongson-1 SoCKeguang Zhang2023-09-164-0/+222
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This glue driver is created based on the arch-code implemented earlier with the platform-specific settings. Use syscon for SYSCON register access. And modify MAINTAINERS to add a new F: entry for this driver. Partially based on the previous work by Serge Semin. Signed-off-by: Keguang Zhang <keguang.zhang@gmail.com> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | dt-bindings: net: Add Loongson-1 Ethernet ControllerKeguang Zhang2023-09-163-0/+228
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add devicetree binding document for Loongson-1 Ethernet controller. And modify MAINTAINERS to add a new F: entry for Loongson1 dt-binding documents. Signed-off-by: Keguang Zhang <keguang.zhang@gmail.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | dt-bindings: mfd: syscon: Add compatibles for Loongson-1 sysconKeguang Zhang2023-09-161-0/+2
|/ / | | | | | | | | | | | | | | | | Add Loongson LS1B and LS1C compatibles for system controller. Signed-off-by: Keguang Zhang <keguang.zhang@gmail.com> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | sfc: make coding style of PTP addresses consistent with coreAlex Austin2023-09-161-13/+14
| | | | | | | | | | | | | | | | | | | | Follow the style used in the core kernel (e.g. include/linux/etherdevice.h and include/linux/in6.h) for the PTP IPv6 and Ethernet addresses. No functional changes. Signed-off-by: Alex Austin <alex.austin@amd.com> Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: ethernet: mtk_wed: do not assume offload callbacks are always setLorenzo Bianconi2023-09-161-15/+17
| | | | | | | | | | | | | | | | | | | | Check if wlan.offload_enable and wlan.offload_disable callbacks are set in mtk_wed_flow_add/mtk_wed_flow_remove since mt7996 will not rely on them. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: add truesize debug checks in skb_{add|coalesce}_rx_frag()Eric Dumazet2023-09-161-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | It can be time consuming to track driver bugs, that might be detected too late from this confusing warning in skb_try_coalesce() WARN_ON_ONCE(delta < len); Add sanity check in skb_add_rx_frag() and skb_coalesce_rx_frag() to better track bug origin for CONFIG_DEBUG_NET=y builds. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: use indirect call helpers for sk->sk_prot->release_cb()Eric Dumazet2023-09-161-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | When adding sk->sk_prot->release_cb() call from __sk_flush_backlog() Paolo suggested using indirect call helpers to take care of CONFIG_RETPOLINE=y case. It turns out Google had such mitigation for years in release_sock(), it is time to make this public :) Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | tcp: indent an if statementDan Carpenter2023-09-151-1/+1
| | | | | | | | | | | | | | Indent this if statement one tab. Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'icssg-half-duplex-support'David S. Miller2023-09-154-2/+38
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ==================== net: Add Half Duplex support for ICSSG Driver This series adds support for half duplex operation for ICSSG driver. In order to support half-duplex operation at 10M and 100M link speeds, the PHY collision detection signal (COL) should be routed to ICSSG GPIO pin (PRGx_PRU0/1_GPI10) so that firmware can detect collision signal and apply the CSMA/CD algorithm applicable for half duplex operation. A DT property, "ti,half-duplex-capable" is introduced for this purpose in the first patch of the series. If board has PHY COL pin conencted to PRGx_PRU1_GPIO10, this DT property can be added to eth node of ICSSG, MII port to support half duplex operation at that port. Second patch of the series configures driver to support half-duplex operation if the DT property "ti,half-duplex-capable" is enabled. This series addresses comments on [v2]. This series is based on the latest net-next/main. This series has no dependency. Changes from v1 to v2: *) Changed the description of "ti,half-duplex-capable" property as asked by Rob and Andrew to avoid confusion between capable and enable. Changes from v1 to v2: *) Dropped the RFC tag. *) Added RB tags of Andrew and Roger. [v1] https://lore.kernel.org/all/20230830113134.1226970-1-danishanwar@ti.com/ [v2] https://lore.kernel.org/all/20230911060200.2164771-1-danishanwar@ti.com/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: ti: icssg-prueth: Add support for half duplex operationMD Danish Anwar2023-09-153-2/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for half duplex operation at 10M and 100M link speeds for AM654x/AM64x devices. - Driver configures rand_seed, a random number, in DMEM HD_RAND_SEED_OFFSET field, which will be used by firmware for Back off time calculation. - Driver informs FW about half duplex link operation in DMEM PORT_LINK_SPEED_OFFSET field by setting bit 7 for 10/100M HD. Hence, the half duplex operation depends on board design the "ti,half-duplex-capable" property has to be enabled for ICSS-G ports if HW is capable to perform half duplex. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Roger Quadros <rogerq@kernel.org> Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | dt-bindings: net: Add documentation for Half duplex support.MD Danish Anwar2023-09-151-0/+7
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to support half-duplex operation at 10M and 100M link speeds, the PHY collision detection signal (COL) should be routed to ICSSG GPIO pin (PRGx_PRU0/1_GPI10) so that firmware can detect collision signal and apply the CSMA/CD algorithm applicable for half duplex operation. A DT property, "ti,half-duplex-capable" is introduced for this purpose. If board has PHY COL pin conencted to PRGx_PRU1_GPIO10, this DT property can be added to eth node of ICSSG, MII port to support half duplex operation at that port. Reviewed-by: Roger Quadros <rogerq@kernel.org> Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: dsa: rtl8366rb: Implement setting up link on CPU portLinus Walleij2023-09-151-9/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We auto-negotiate most ports in the RTL8366RB driver, but the CPU port is hard-coded to 1Gbit, full duplex, tx and rx pause. This isn't very nice. People may configure speed and duplex differently in the device tree. Actually respect the arguments passed to the function for the CPU port, which get passed properly after Russell's patch "net: dsa: realtek: add phylink_get_caps implementation" After this the link is still set up properly. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | octeontx2-pf: Enable PTP PPS output supportHariprasad Kelam2023-09-153-37/+85
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PTP block supports generating PPS output signal on GPIO pin. This patch adds the support in the PTP PHC driver using standard periodic output interface. User can enable/disable/configure PPS by writing to the below sysfs entry echo perout.index start.sec start.nsec period.sec period.nsec > /sys/class/ptp/ptp0/period Example to generate 50% duty cycle PPS signal: echo 0 0 0 0 500000000 > /sys/class/ptp/ptp0/period Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: Sai Krishna <saikrishnag@marvell.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'ipv6-data-races'David S. Miller2023-09-1524-259/+238
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Eric Dumazet says: ==================== ipv6: round of data-races fixes This series is inspired by one related syzbot report. Many inet6_sk(sk) fields reads or writes are racy. Move 1-bit fields to inet->inet_flags to provide atomic safety. inet6_{test|set|clear|assign}_bit() helpers could be changed later if we need to make room in inet_flags. Also add missing READ_ONCE()/WRITE_ONCE() when lockless readers need access to specific fields. np->srcprefs will be handled separately to avoid merge conflicts because a prior patch was posted for net tree. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | ipv6: lockless IPV6_FLOWINFO_SEND implementationEric Dumazet2023-09-1513-23/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | np->sndflow reads are racy. Use one bit ftom atomic inet->inet_flags instead, IPV6_FLOWINFO_SEND setsockopt() can be lockless. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | ipv6: lockless IPV6_MTU_DISCOVER implementationEric Dumazet2023-09-157-22/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Most np->pmtudisc reads are racy. Move this 3bit field on a full byte, add annotations and make IPV6_MTU_DISCOVER setsockopt() lockless. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | ipv6: lockless IPV6_ROUTER_ALERT_ISOLATE implementationEric Dumazet2023-09-154-11/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reads from np->rtalert_isolate are racy. Move this flag to inet->inet_flags to fix data-races. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | ipv6: move np->repflow to atomic flagsEric Dumazet2023-09-156-15/+14
| | | | | | | | | | | | | | | | | | | | | | | | Move np->repflow to inet->inet_flags to fix data-races. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | ipv6: lockless IPV6_RECVERR implemetationEric Dumazet2023-09-1511-32/+25
| | | | | | | | | | | | | | | | | | | | | | | | np->recverr is moved to inet->inet_flags to fix data-races. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | ipv6: lockless IPV6_DONTFRAG implementationEric Dumazet2023-09-1511-17/+16
| | | | | | | | | | | | | | | | | | | | | | | | Move np->dontfrag flag to inet->inet_flags to fix data-races. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | ipv6: lockless IPV6_AUTOFLOWLABEL implementationEric Dumazet2023-09-155-16/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | Move np->autoflowlabel and np->autoflowlabel_set in inet->inet_flags, to fix data-races. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | ipv6: lockless IPV6_MULTICAST_ALL implementationEric Dumazet2023-09-155-11/+9
| | | | | | | | | | | | | | | | | | | | | | | | Move np->mc_all to an atomic flags to fix data-races. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | ipv6: lockless IPV6_RECVERR_RFC4884 implementationEric Dumazet2023-09-154-11/+10
| | | | | | | | | | | | | | | | | | | | | | | | Move np->recverr_rfc4884 to an atomic flag to fix data-races. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | ipv6: lockless IPV6_MINHOPCOUNT implementationEric Dumazet2023-09-151-16/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | Add one missing READ_ONCE() annotation in do_ipv6_getsockopt() and make IPV6_MINHOPCOUNT setsockopt() lockless. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | ipv6: lockless IPV6_MTU implementationEric Dumazet2023-09-152-16/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | np->frag_size can be read/written without holding socket lock. Add missing annotations and make IPV6_MTU setsockopt() lockless. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | ipv6: lockless IPV6_MULTICAST_HOPS implementationEric Dumazet2023-09-156-25/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes data-races around np->mcast_hops, and make IPV6_MULTICAST_HOPS lockless. Note that np->mcast_hops is never negative, thus can fit an u8 field instead of s16. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | ipv6: lockless IPV6_MULTICAST_LOOP implementationEric Dumazet2023-09-158-25/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add inet6_{test|set|clear|assign}_bit() helpers. Note that I am using bits from inet->inet_flags, this might change in the future if we need more flags. While solving data-races accessing np->mc_loop, this patch also allows to implement lockless accesses to np->mcast_hops in the following patch. Also constify sk_mc_loop() argument. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | ipv6: lockless IPV6_UNICAST_HOPS implementationEric Dumazet2023-09-156-24/+16
|/ / | | | | | | | | | | | | | | | | | | Some np->hop_limit accesses are racy, when socket lock is not held. Add missing annotations and switch to full lockless implementation. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netPaolo Abeni2023-09-14948-11119/+23357
|\ \ | | | | | | | | | | | | | | | | | | | | | Cross-merge networking fixes after downstream PR. No conflicts. Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * \ Merge tag 'net-6.6-rc2' of ↵Linus Torvalds2023-09-1436-207/+352
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Quite unusually, this does not contains any fix coming from subtrees (nf, ebpf, wifi, etc). Current release - regressions: - bcmasp: fix possible OOB write in bcmasp_netfilt_get_all_active() Previous releases - regressions: - ipv4: fix one memleak in __inet_del_ifa() - tcp: fix bind() regressions for v4-mapped-v6 addresses. - tls: do not free tls_rec on async operation in bpf_exec_tx_verdict() - dsa: fixes for SJA1105 FDB regressions - veth: update XDP feature set when bringing up device - igb: fix hangup when enabling SR-IOV Previous releases - always broken: - kcm: fix memory leak in error path of kcm_sendmsg() - smc: fix data corruption in smcr_port_add - microchip: fix possible memory leak for vcap_dup_rule()" * tag 'net-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (37 commits) kcm: Fix error handling for SOCK_DGRAM in kcm_sendmsg(). net: renesas: rswitch: Add spin lock protection for irq {un}mask net: renesas: rswitch: Fix unmasking irq condition igb: clean up in all error paths when enabling SR-IOV ixgbe: fix timestamp configuration code selftest: tcp: Add v4-mapped-v6 cases in bind_wildcard.c. selftest: tcp: Move expected_errno into each test case in bind_wildcard.c. selftest: tcp: Fix address length in bind_wildcard.c. tcp: Fix bind() regression for v4-mapped-v6 non-wildcard address. tcp: Fix bind() regression for v4-mapped-v6 wildcard address. tcp: Factorise sk_family-independent comparison in inet_bind2_bucket_match(_addr_any). ipv6: fix ip6_sock_set_addr_preferences() typo veth: Update XDP feature set when bringing up device net: macb: fix sleep inside spinlock net/tls: do not free tls_rec on async operation in bpf_exec_tx_verdict() net: ethernet: mtk_eth_soc: fix pse_port configuration for MT7988 net: ethernet: mtk_eth_soc: fix uninitialized variable kcm: Fix memory leak in error path of kcm_sendmsg() r8152: check budget for r8152_poll() net: dsa: sja1105: block FDB accesses that are concurrent with a switch reset ...
| | * | kcm: Fix error handling for SOCK_DGRAM in kcm_sendmsg().Kuniyuki Iwashima2023-09-141-7/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | syzkaller found a memory leak in kcm_sendmsg(), and commit c821a88bd720 ("kcm: Fix memory leak in error path of kcm_sendmsg()") suppressed it by updating kcm_tx_msg(head)->last_skb if partial data is copied so that the following sendmsg() will resume from the skb. However, we cannot know how many bytes were copied when we get the error. Thus, we could mess up the MSG_MORE queue. When kcm_sendmsg() fails for SOCK_DGRAM, we should purge the queue as we do so for UDP by udp_flush_pending_frames(). Even without this change, when the error occurred, the following sendmsg() resumed from a wrong skb and the queue was messed up. However, we have yet to get such a report, and only syzkaller stumbled on it. So, this can be changed safely. Note this does not change SOCK_SEQPACKET behaviour. Fixes: c821a88bd720 ("kcm: Fix memory leak in error path of kcm_sendmsg()") Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20230912022753.33327-1-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| | * | Merge branch 'net-renesas-rswitch-fix-a-lot-of-redundant-irq-issue'Paolo Abeni2023-09-142-4/+18
| | |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Yoshihiro Shimoda says: ==================== net: renesas: rswitch: Fix a lot of redundant irq issue After this patch series was applied, a lot of redundant interrupts no longer occur. For example: when "iperf3 -c <ipaddr> -R" on R-Car S4-8 Spider Before the patches are applied: about 800,000 times happened After the patches were applied: about 100,000 times happened ==================== Link: https://lore.kernel.org/r/20230912014936.3175430-1-yoshihiro.shimoda.uh@renesas.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>