linux-stable.git - Linux kernel stable tree

	Commit message	Author	Age	Files	Lines
*	net/sched: cls_matchall: Fix error path	Yotam Gigi	2017-01-03	1	-6/+16
\| \| \| \| \| \| \| \| \| \| \|	Fix several error paths in matchall: - Release reference to actions in case the hardware fails offloading (relevant to skip_sw only) - Fix error path in case tcf_exts initialization/validation fail Fixes: bf3994d2ed31 ("net/sched: introduce Match-all classifier") Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue	David S. Miller	2017-01-03	20	-199/+1101
\|\	Jeff Kirsher says: ==================== 10GbE Intel Wired LAN Driver Updates 2017-01-03 This series contains updates to ixgbe and ixgbevf only. Emil fixes ixgbe to use the NVM settings for FEC instead of overriding them. Fixed the indirection table for x550, where newer devices can support up to 64 RSS queues. Extends the rtnl_lock() to protect the call to netif_device_detach() and ixgbe_clear_interrupt_scheme() to avoid a double free WARN and/or a BUG in free_msi_irqs(). Fixed AER error handling by making sure that the driver frees the IRQs in ixgbe_io_error_detected() when responding to a PCIe AER error, and restores them when the interface recovers. Tony updates the driver to report the driver version to the firmware using the host interface command for x550 devices. Fixed the PHY reset check for x550em_ext_t PHY type. Fixed bounds checking for x540 devices to ensure the index is valid for the LED function. Fixed the BaseT adapters which support 100Mb capability and were not reporting the capability. Ken Cox adds a missing check for the trusted bit before trying to set the MACVLAN MAC address. Yusuke Suzuki fixes an issue with 82599 and x540 devices where receive timestamps were not working because the bitwise operation for the RX_HWTSTAMP flag was incorrect. Don ensures that x553 KR/KX devices correctly advertise link speeds. Adds the mailbox message to allow for VF promiscuous mode support. Mark fixes two issues with EEPROM access, where the semaphore was not being held until the entire response was read and the acquiring/releasing of the semaphore is slow. Cleaned up firmware version method and functions which are no longer used. Added new interfaces for firmware commands to access some new PHYs. v2: fixed tab indentation in patch 12 and mis-spelled words in patch 15 based on feedback from Sergei Shtylyov and Rami Rosen. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	ixgbe: Add PF support for VF promiscuous mode	Don Skidmore	2017-01-03	4	-6/+45
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch extends the xcast mailbox message to include support for unicast promiscuous mode. To allow a VF to enter this mode the PF must be in promiscuous mode. A later patch will add the support needed in the VF driver (ixgbevf) Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ixgbevf: Add support for VF promiscuous mode	Don Skidmore	2017-01-03	4	-5/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch extends the mailbox message to allow for VF promiscuous mode support. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ixgbe: Implement support for firmware-controlled PHYs	Mark Rustad	2017-01-03	7	-6/+642
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement support for devices that have firmware-controlled PHYs. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ixgbe: Implement firmware interface to access some PHYs	Mark Rustad	2017-01-03	3	-0/+113
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement new interface for firmware commands to access some PHYs. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ixgbe: Remove unused firmware version functions and method	Mark Rustad	2017-01-03	7	-46/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The firmware version method and functions are not used anywhere, so remove them all. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ixgbe: Fix issues with EEPROM access	Mark Rustad	2017-01-03	3	-82/+91
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are two problems with EEPROM access. One is that it needs to hold the semaphore until the entire response is read or else the response can be corrupted by other firmware accesses. The second problem is that acquiring and releasing the semaphore is slow, so it should be taken and released once when multiple EEPROM accesses will be done. Both of these issues can be solved by adding a new function, ixgbe_hic_unlocked, to issue firmware commands that will assume that the caller has acquired the needed semaphore. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
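The locking pattern described in this commit message can be sketched in user space as follows. This is a minimal illustration under stated assumptions, not the driver's code: `struct demo_hw`, `demo_acquire_sem()`, `demo_release_sem()`, `demo_read_word_unlocked()` and `demo_read_buffer()` are hypothetical names, and the "semaphore" is a plain flag with a counter. Only the idea of an unlocked helper that assumes the caller already holds the semaphore comes from the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_EEPROM_WORDS 8

/* Hypothetical stand-in for the hardware state; not the ixgbe structure. */
struct demo_hw {
	int sem_held;                  /* 1 while the semaphore is held */
	unsigned int sem_acquisitions; /* counts how often it was taken */
	uint16_t eeprom[DEMO_EEPROM_WORDS];
};

/* Take the (simulated) semaphore; fails if it is already held. */
int demo_acquire_sem(struct demo_hw *hw)
{
	assert(hw != NULL);
	if (hw->sem_held)
		return -1;
	hw->sem_held = 1;
	hw->sem_acquisitions++;
	return 0;
}

void demo_release_sem(struct demo_hw *hw)
{
	assert(hw != NULL);
	hw->sem_held = 0;
}

/* Unlocked variant: the caller must already hold the semaphore. */
int demo_read_word_unlocked(struct demo_hw *hw, size_t offset, uint16_t *out)
{
	if (!hw->sem_held || offset >= DEMO_EEPROM_WORDS)
		return -1;
	*out = hw->eeprom[offset];
	return 0;
}

/* Read several words while taking and releasing the semaphore only once. */
int demo_read_buffer(struct demo_hw *hw, size_t offset, size_t count,
		     uint16_t *out)
{
	size_t i;
	int err = demo_acquire_sem(hw);

	if (err)
		return err;
	for (i = 0; i < count; i++) {
		err = demo_read_word_unlocked(hw, offset + i, &out[i]);
		if (err)
			break;
	}
	demo_release_sem(hw);
	return err;
}
```

In this sketch a three-word read costs one acquisition instead of three, and no other accessor can interleave between the words because the semaphore stays held for the whole loop.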
\| *	ixgbe: Configure advertised speeds correctly for KR/KX backplane	Don Skidmore	2017-01-03	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch ensures that the advertised link speeds are configured for X553 KR/KX backplane. Without this patch the link remains at 1G when resuming from low power after being downshifted by LPLU. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ixgbevf: restore hw_addr on resume or error	Emil Tantilov	2017-01-03	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Restore adapter->hw.hw_addr after handling an error, or a resume operation to make sure we can access the registers. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ixgbe: Fix incorrect bitwise operations of PTP Rx timestamp flags	Yusuke Suzuki	2017-01-03	1	-6/+6
\| \|	Rx timestamp does not work on 82599 and X540 because bitwise operation of RX_HWTSTAMP flags is incorrect and ixgbe_ptp_rx_hwtstamp() is never called. This patch fixes it to enable Rx timestamp on 82599 and X540.
\| \|	Without this fix:
\| \|	ptp4l[278.730]: selected /dev/ptp8 as PTP clock
\| \|	ptp4l[278.733]: port 1: INITIALIZING to LISTENING on INITIALIZE
\| \|	ptp4l[278.733]: port 0: INITIALIZING to LISTENING on INITIALIZE
\| \|	ptp4l[278.834]: port 1: received SYNC without timestamp
\| \|	ptp4l[278.835]: port 1: new foreign master 1c3947.fffe.60f9cc-1
\| \|	ptp4l[279.834]: port 1: received SYNC without timestamp
\| \|	ptp4l[280.834]: port 1: received SYNC without timestamp
\| \|	ptp4l[281.834]: port 1: received SYNC without timestamp
\| \|	ptp4l[282.834]: port 1: received SYNC without timestamp
\| \|	ptp4l[282.835]: selected best master clock 1c3947.fffe.60f9cc
\| \|	ptp4l[282.835]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
\| \|	ptp4l[283.834]: port 1: received SYNC without timestamp
\| \|	With this fix:
\| \|	ptp4l[239.154]: selected /dev/ptp8 as PTP clock
\| \|	ptp4l[239.157]: port 1: INITIALIZING to LISTENING on INITIALIZE
\| \|	ptp4l[239.157]: port 0: INITIALIZING to LISTENING on INITIALIZE
\| \|	ptp4l[240.989]: port 1: new foreign master 1c3947.fffe.60f9cc-1
\| \|	ptp4l[244.989]: selected best master clock 1c3947.fffe.60f9cc
\| \|	ptp4l[244.989]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
\| \|	ptp4l[246.977]: master offset -899583339542096 s0 freq +0 path delay 16222
\| \|	ptp4l[247.977]: master offset -899583339617265 s1 freq -75169 path delay 16177
\| \|	ptp4l[248.977]: master offset -130 s2 freq -75299 path delay 16177
\| \|	ptp4l[248.977]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
\| \|	ptp4l[249.977]: master offset -9 s2 freq -75217 path delay 16177
\| \|	ptp4l[250.977]: master offset 88 s2 freq -75123 path delay 16132
\| \|	Fixes: a9763f3cb54c ("ixgbe: Update PTP to support X550EM_x devices") Signed-off-by: Yusuke Suzuki <yus-suzuki@uf.jp.nec.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
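The commit message above does not show the diff, so the following is only a sketch of one common way a bitwise flag operation goes wrong, not the actual ixgbe change. All names (`DEMO_FLAG_*`, `demo_set_flags_buggy()`, `demo_set_flags_fixed()`, `demo_count_timestamped()`) are hypothetical. It shows how combining two distinct flag bits with `&` instead of `|` yields an empty mask, so a timestamp path guarded by the flag is never taken.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits; names and values are illustrative only. */
#define DEMO_FLAG_RX_TS_ENABLED     (UINT32_C(1) << 0)
#define DEMO_FLAG_RX_TS_IN_REGISTER (UINT32_C(1) << 1)

/* Buggy: A & B of two distinct single-bit flags is 0, so nothing is set. */
uint32_t demo_set_flags_buggy(uint32_t flags)
{
	return flags | (DEMO_FLAG_RX_TS_ENABLED & DEMO_FLAG_RX_TS_IN_REGISTER);
}

/* Fixed: A | B builds the intended two-bit mask. */
uint32_t demo_set_flags_fixed(uint32_t flags)
{
	return flags | (DEMO_FLAG_RX_TS_ENABLED | DEMO_FLAG_RX_TS_IN_REGISTER);
}

/* Count packets that would reach the timestamp handler for these flags. */
int demo_count_timestamped(uint32_t flags, int n_packets)
{
	int i, count = 0;

	assert(n_packets >= 0);
	for (i = 0; i < n_packets; i++) {
		if (flags & DEMO_FLAG_RX_TS_ENABLED)
			count++; /* stand-in for calling the Rx timestamp handler */
	}
	return count;
}
```

With the buggy helper the flags stay at zero and no packet is timestamped, which is the same symptom class as the "received SYNC without timestamp" lines in the log above.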
\| *	ixgbevf: fix AER error handling	Emil Tantilov	2017-01-03	1	-18/+25
\| \|	Make sure that we free the IRQs in ixgbevf_io_error_detected() when responding to a PCIe AER error and also restore them when the interface recovers from it. Previously it was possible to trigger the BUG_ON() check in free_msix_irqs() in the case where we call ixgbevf_remove() after a failed recovery from an AER error, because the interrupts were not freed. Also moved the down and free functions into ixgbevf_close_suspend(), same as with ixgbe. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ixgbe: fix AER error handling	Emil Tantilov	2017-01-03	1	-2/+4
\| \|	Make sure that we free the IRQs in ixgbe_io_error_detected() when responding to a PCIe AER error and also restore them when the interface recovers from it. Previously it was possible to trigger the BUG_ON() check in free_msix_irqs() in the case where we call ixgbe_remove() after a failed recovery from an AER error, because the interrupts were not freed. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ixgbe: test for trust in macvlan adjustments for VF	Ken Cox	2017-01-03	1	-1/+2
\| \|	There are two methods for setting MAC addresses in a macvlan, distinguished in the function macvlan_set_mac_address. If the macvlan mode is passthru, then we use the dev_set_mac_address method, otherwise we use the dev_uc api via macvlan_sync_addresses. The latter method (which would stem from using any non-passthru mode, like bridge, or vepa), calls down into the driver in a path that terminates in ixgbevf_set_uc_addr_vf, which sends an IXGBE_VF_SET_MACVLAN message, which causes the pf to spawn the noted error message. This occurs because it appears that the guest is trying to delete the mac address of the macvlan before adding another. The other path in macvlan_set_mac_address uses dev_set_mac_address, which calls into ixgbevf_set_mac, which sends IXGBE_VF_SET_MAC_ADDR to the pf to set the macvlan mac address. The discrepancy here is in the handlers. The handler function for IXGBE_VF_SET_MAC_ADDR (ixgbe_set_vf_mac_addr) has a check for the vfinfo[].trusted bit to allow the operation if the vf is trusted. In comparison, the IXGBE_VF_SET_MACVLAN message handler (ixgbe_set_vf_macvlan_msg) has no such check of the trusted bit. Signed-off-by: Ken Cox <jkc@redhat.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ixgbevf: handle race between close and suspend on shutdown	Emil Tantilov	2017-01-03	1	-2/+9
\| \|	When an interface is part of a namespace it is possible that ixgbevf_close() may be called while ixgbevf_suspend() is running, which ends up in a double free WARN and/or a BUG in free_msi_irqs(). To handle this situation we extend the rtnl_lock() to protect the call to netif_device_detach() and check for !netif_device_present() to avoid entering close while in suspend. Also added rtnl locks to ixgbevf_queue_reset_subtask(). CC: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ixgbe: handle close/suspend race with netif_device_detach/present	Emil Tantilov	2017-01-03	1	-9/+8
\| \|	When an interface is part of a namespace it is possible that ixgbe_close() may be called while __ixgbe_shutdown() is running, which ends up in a double free WARN and/or a BUG in free_msi_irqs(). To handle this situation we extend the rtnl_lock() to protect the call to netif_device_detach() and ixgbe_clear_interrupt_scheme() in __ixgbe_shutdown() and check for netif_device_present() to avoid clearing the interrupts a second time in ixgbe_close(). Also extend the rtnl lock in ixgbe_resume() to netif_device_attach(). Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ixgbe: Fix reporting of 100Mb capability	Tony Nguyen	2017-01-03	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	BaseT adapters that are capable of supporting 100Mb are not reporting this capability. This patch corrects the reporting so that 100Mb is shown as supported on those adapters. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ixgbe: Reduce I2C retry count on X550 devices	Tony Nguyen	2017-01-03	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A retry count of 10 is likely to run into problems on X550 devices that have to detect and reset unresponsive CS4227 devices. So, reduce the I2C retry count to 3 for X550 and above. This should avoid any possible regressions in existing devices. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ixgbe: Add bounds check for x540 LED functions	Tony Nguyen	2017-01-03	1	-2/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is an extension of commit 003287e0f087 ("ixgbevf: Correct parameter sent to LED function"); add bounds checking to x540 functions to ensure the index is valid. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ixgbe: add mask for 64 RSS queues	Emil Tantilov	2017-01-03	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The indirection table was reported incorrectly for X550 and newer where we can support up to 64 RSS queues. Reported-by Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ixgbe: Fix check for ixgbe_phy_x550em_ext_t reset	Tony Nguyen	2017-01-03	1	-4/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The generic PHY reset check we had previously is not sufficient for the ixgbe_phy_x550em_ext_t PHY type. Check 1.CC02.0 instead - same as ixgbe_init_ext_t_x550(). Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ixgbe: Report driver version to firmware for x550 devices	Tony Nguyen	2017-01-03	5	-7/+80
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some x550 devices require the driver version reported to its firmware; this patch sends the driver version string to the firmware through the host interface command for x550 devices. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| *	ixgbe: do not disable FEC from the driver	Emil Tantilov	2017-01-03	1	-2/+0
\|/ \| \| \| \| \| \| \| \|	FEC is configured by the NVM and the driver should not be overriding it. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
*	Merge branch 'tipc-link-starvation'	David S. Miller	2017-01-03	7	-349/+319
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Jon Maloy says: ==================== tipc: improve interaction socket-link We fix a very real starvation problem that may occur when a link encounters send buffer congestion. At the same time we make the interaction between the socket and link layer simpler and more consistent. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	tipc: reduce risk of user starvation during link congestion	Jon Paul Maloy	2017-01-03	5	-251/+194
\| \|	The socket code currently handles link congestion by either blocking and trying to send again when the congestion has abated, or just returning to the user with -EAGAIN and letting the caller retry later. This mechanism is prone to starvation, because the wakeup algorithm is non-atomic. Between the time the link issues a wakeup signal and the time the socket wakes up and re-attempts sending, other senders may have come in and occupied the free buffer space in the link. This in turn may lead to a socket having to make many send attempts before it is successful. In extremely loaded systems we have observed latency times of several seconds before a low-priority socket is able to send out a message. In this commit, we simplify this mechanism and reduce the risk of the described scenario happening. When an attempt is made to send a message via a congested link, we now let it be added to the link's backlog queue anyway, thus permitting an oversubscription of one message per source socket. We still create a wakeup item and return an error code, hence instructing the sender to block or stop sending. Only when enough space has been freed up in the link's backlog queue do we issue a wakeup event that allows the sender to continue with the next message, if any. The fact that a socket now can consider a message sent even when the link returns a congestion code means that the sending socket code can be simplified. Also, since this is a good opportunity to get rid of the obsolete 'mtu change' condition in the three socket send functions, we now choose to refactor those functions completely. Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
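The backlog behaviour described in this commit message can be modelled with a small user-space sketch. It is an illustration under assumptions, not the TIPC implementation: `struct demo_link`, `demo_link_xmit()`, `demo_link_drain()` and the `DEMO_*` return codes are hypothetical, and the backlog is reduced to a counter. The sketch keeps three properties from the message: a message sent on a congested link is queued anyway, the sender is told to stop, and the sender is woken only once enough backlog space has been freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_SENDERS 4
#define DEMO_OK          0
#define DEMO_ELINKCONG   (-1) /* queued, but the sender must now wait */
#define DEMO_EBLOCKED    (-2) /* sender is still waiting; nothing queued */

/* Hypothetical link model: the backlog is only a length and a limit. */
struct demo_link {
	unsigned int backlog_len;
	unsigned int backlog_limit;
	bool waiting[DEMO_MAX_SENDERS]; /* one wakeup item per sender */
};

int demo_link_xmit(struct demo_link *l, unsigned int sender)
{
	bool congested;

	assert(l != NULL && sender < DEMO_MAX_SENDERS);
	if (l->waiting[sender])
		return DEMO_EBLOCKED;

	congested = l->backlog_len >= l->backlog_limit;
	l->backlog_len++; /* accept the message even when congested */
	if (!congested)
		return DEMO_OK;

	l->waiting[sender] = true; /* at most one extra message per sender */
	return DEMO_ELINKCONG;
}

/* Remove up to n messages; wake waiters only once there is room again. */
unsigned int demo_link_drain(struct demo_link *l, unsigned int n)
{
	unsigned int i, woken = 0;

	assert(l != NULL);
	l->backlog_len -= (n < l->backlog_len) ? n : l->backlog_len;
	if (l->backlog_len >= l->backlog_limit)
		return 0;
	for (i = 0; i < DEMO_MAX_SENDERS; i++) {
		if (l->waiting[i]) {
			l->waiting[i] = false;
			woken++;
		}
	}
	return woken;
}
```

Because the congested message is already in the backlog when the sender is woken, the sender does not have to compete again for the space that was just freed, which is the starvation window the message describes.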
\| *	tipc: modify struct tipc_plist to be more versatile	Jon Paul Maloy	2017-01-03	3	-46/+83
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	During multicast reception we currently use a simple linked list with push/pop semantics to store port numbers. We now see a need for a more generic list for storing values of type u32. We therefore make some modifications to this list, while replacing the prefix 'tipc_plist_' with 'u32_'. We also add a couple of new functions which will come to use in the next commits. Acked-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	tipc: unify tipc_wait_for_sndpkt() and tipc_wait_for_sndmsg() functions	Jon Paul Maloy	2017-01-03	1	-59/+49
\|/	The functions tipc_wait_for_sndpkt() and tipc_wait_for_sndmsg() are very similar. The latter function is also called from two locations, and there will be more in the coming commits, which will all need to test on different conditions. Instead of making yet another duplicate of the function, we now introduce a new macro tipc_wait_for_cond() where the wakeup condition can be stated as an argument to the call. This macro replaces all current and future uses of the two functions, which can now be eliminated. Acked-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
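The idea of passing the wakeup condition as a macro argument can be sketched in user space like this. It is not the kernel's tipc_wait_for_cond(): the names `demo_wait_for_cond`, `struct demo_sock`, `demo_poll()` and `DEMO_EAGAIN` are hypothetical, and the sketch polls with a retry budget where the kernel macro would sleep on a wait queue with a timeout.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_EAGAIN (-11)

/* Hypothetical socket model: only tracks free send slots. */
struct demo_sock {
	int free_slots;
};

/* Stand-in for "sleep until woken": each poll frees one slot. */
void demo_poll(struct demo_sock *sk)
{
	assert(sk != NULL);
	sk->free_slots++;
}

/*
 * Evaluate cond_ repeatedly, polling in between, until it holds or the
 * budget runs out. The condition is an arbitrary expression supplied by
 * the caller, so one macro serves every wait site.
 */
#define demo_wait_for_cond(sk_, budget_, cond_, rc_) \
do { \
	int left_ = (budget_); \
	(rc_) = DEMO_EAGAIN; \
	for (;;) { \
		if (cond_) { \
			(rc_) = 0; \
			break; \
		} \
		if (left_-- <= 0) \
			break; \
		demo_poll(sk_); \
	} \
} while (0)

/* Two call sites with different conditions share the same macro. */
int demo_wait_for_room(struct demo_sock *sk, int budget)
{
	int rc;

	demo_wait_for_cond(sk, budget, sk->free_slots > 0, rc);
	return rc;
}

int demo_wait_for_slots(struct demo_sock *sk, int budget, int needed)
{
	int rc;

	demo_wait_for_cond(sk, budget, sk->free_slots >= needed, rc);
	return rc;
}
```

A macro is used here rather than a function because the condition has to be re-evaluated on every iteration; a function parameter would capture its value only once.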
*	Merge branch 'TPACKET_V3-TX_RING-support'	David S. Miller	2017-01-03	3	-28/+111
\|\	Sowmini Varadhan says: ==================== TPACKET_V3 TX_RING support This patch series allows an application to use a single PF_PACKET descriptor and leverage the best implementations of TX_RING and RX_RING that exist today. Patch 1 adds the kernel/Documentation changes for TX_RING support and patch 2 adds the associated test case in selftests. Changes since v2: additional sanity checks for setsockopt input for TX_RING/TPACKET_V3. Refactored psock_tpacket.c test code to avoid code duplication from V2. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	tools: test case for TPACKET_V3/TX_RING support	Sowmini Varadhan	2017-01-03	1	-17/+74
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a test case and sample code for (TPACKET_V3, PACKET_TX_RING) Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	af_packet: TX_RING support for TPACKET_V3	Sowmini Varadhan	2017-01-03	2	-11/+37
\|/	Although TPACKET_V3 Rx has some benefits over TPACKET_V2 Rx, TPACKET_V3 does not currently have TX_RING support. As a result an application that wants the best perf for Tx and Rx (e.g. to handle request/response transactions) ends up needing 2 sockets, one with TPACKET_V2 for Tx and another with TPACKET_V3 for Rx. This patch enables TPACKET_V2 compatible Tx features in TPACKET_V3 so that an application can use a single descriptor to get the benefits of TPACKET_V3 RX_RING and TPACKET_V2 TX_RING. An application may do a block-send by first filling up multiple frames in the Tx ring and then triggering a transmit. This patch only supports fixed size Tx frames for TPACKET_V3, and requires that tp_next_offset be zero. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	sfc-falcon: declare module version (same as ethtool drvinfo version)	Edward Cree	2017-01-03	1	-0/+1
\| \| \| \| \|	Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	sfc: declare module version (same as ethtool drvinfo version)	Edward Cree	2017-01-03	1	-0/+1
\| \| \| \| \|	Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipmr, ip6mr: add RTNH_F_UNRESOLVED flag to unresolved cache entries	Nikolay Aleksandrov	2017-01-03	3	-2/+7
\| \| \| \| \| \| \| \| \| \| \| \| \|	While working with ipmr, we noticed that it is impossible to determine if an entry is actually unresolved or its IIF interface has disappeared (e.g. virtual interface got deleted). These entries look almost identical to user-space when dumping or receiving notifications. So in order to recognize them add a new RTNH_F_UNRESOLVED flag which is set when sending an unresolved cache entry to user-space. Suggested-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	dsa:mv88e6xxx: allow address 0x1 in smi_init	Volodymyr Bendiuga	2017-01-03	1	-4/+0
\| \| \| \| \| \| \| \| \| \|	Some devices, such as the mv88e6097 do have ADDR[0] external and so it is possible to configure the device to use SMI address 0x1. Remove the restriction, as there are boards using this address. Signed-off-by: Volodymyr Bendiuga <volodymyr.bendiuga@westermo.se> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: freescale: dpaa: use new api ethtool_{get\|set}_link_ksettings	Philippe Reynes	2017-01-03	1	-9/+9
\| \| \| \| \| \| \| \|	The ethtool api {get\|set}_settings is deprecated. We move this driver to new api {get\|set}_link_ksettings. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'for_4.11/net-next/rds_v3' of git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux	David S. Miller	2017-01-03	18	-65/+335
\|\	Santosh Shilimkar says: ==================== net: RDS updates v2->v3: - Re-based against latest net-next head. - Dropped a user visible change after discussing with David Miller. It needs some more work to fully support old/new tools matrix. - Addressed Dave's comment about bool usage in patch "RDS: IB: track and log active side..." v1->v2: Re-aligned indentation in patch "RDS: mark few internal functions..." Series consists of: - RDMA transport fixes for map failure, listen sequence, handler panic and composite message notification. - Couple of sparse fixes. - Message logging improvements for bind failure, use once mr semantics and connection remote address, active end point. - Performance improvement for RDMA transport by reducing the post send pressure on the queue and spreading the CQ vectors. - Useful statistics for socket send/recv usage and receive cache usage. - Additional RDS CMSG used by application to track the RDS message stages for certain type of traffic to find out latency spots. Can be enabled/disabled per socket. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	RDS: add receive message trace used by application	Santosh Shilimkar	2017-01-02	6	-3/+109
\| \|	Socket option to tap receive path latency at various stages, in nanoseconds. It can be enabled on selected sockets using the SO_RDS_MSG_RXPATH_LATENCY socket option. RDS will return the data to the application with RDS_CMSG_RXPATH_LATENCY in a defined format. Scope is left to add more trace points in the future without needing to change the interface. Reviewed-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
\| *	RDS: make message size limit compliant with spec	Avinash Repaka	2017-01-02	3	-1/+42
\| \|	RDS supports a maximum message size of 1M, but the code doesn't check this in all cases. This patch fixes it for RDMA and non-RDMA messages and for the RDS MR size, and the limit is enforced irrespective of the underlying transport. Signed-off-by: Avinash Repaka <avinash.repaka@oracle.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
\| *	RDS: add stat for socket recv memory usage	Venkat Venkatsubra	2017-01-02	2	-0/+7
\| \|	Tracks the receive-side memory added to sockets and removed from sockets. Signed-off-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
\| *	RDS: IB: fix panic due to handlers running post teardown	Santosh Shilimkar	2017-01-02	2	-0/+13
\| \|	The reaping loop in the shutdown code takes care of emptying the CQs before they are destroyed, and once the tasklets are killed the handlers are not expected to run. But because of core tasklet code issues, a tasklet handler could still run even after tasklet_kill. The RDS IB shutdown code already reaps the CQs before freeing cq/qp resources, so the handlers have nothing left to do post shutdown. On the other hand, any handler running after teardown and trying to access already freed qp/cq resources causes issues. The patch fixes this race by making sure that handlers return without any action post teardown. Reviewed-by: Wengang <wen.gang.wang@oracle.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
\| *	RDS: RDMA: Fix the composite message user notification	Santosh Shilimkar	2017-01-02	4	-11/+29
\| \|	When an application sends an RDS RDMA composite message consisting of an RDMA transfer followed by a non-RDMA payload, it expects to be notified only when the full message has been delivered. RDS RDMA notification doesn't behave this way though. Thanks to Venkat for debugging and root-causing the issue where only the first part of the message (RDMA) was successfully delivered but the remaining payload delivery failed. In that case, the application should not be notified with a false positive of message delivery success. Fix this case by making sure the user gets notified only after the full message delivery. Reviewed-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
\| *	RDS: IB: Add vector spreading for cqs	Santosh Shilimkar	2017-01-02	3	-3/+53
\| \|	Based on the available device vectors, allocate CQs accordingly to get a better spread of completion vectors, which helps performance a great deal. Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
\| *	RDS: IB: add few useful cache stasts	Santosh Shilimkar	2017-01-02	3	-0/+15
\| \| \| \| \| \| \| \| \| \| \| \|	Tracks the ib receive cache total, incoming and frag allocations. Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
\| *	RDS: IB: track and log active side endpoint in connection	Santosh Shilimkar	2017-01-02	2	-4/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Useful to know the active and passive end points in a RDS IB connection. Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
\| *	RDS: RDMA: silence the use_once mr log flood	Santosh Shilimkar	2017-01-02	1	-1/+2
\| \|	In the absence of extension headers, this log message keeps flooding the console. Even without use_once we can clean up the MRs, so it is not really an error case; make it a debug message. Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
\| *	RDS: IB: split the mr registration and invalidation path	Santosh Shilimkar	2017-01-02	3	-8/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	MR invalidation in RDS is done in background thread and not in data path like registration. So break the dependency between them which helps to remove the performance bottleneck. Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
\| *	RDS: RDMA: return appropriate error on rdma map failures	Santosh Shilimkar	2017-01-02	1	-1/+10
\| \|	The first message to a remote node should prompt a new connection even if it is an RDMA operation. For an RDMA operation the MR mapping can fail because the connection is not yet up. Since connection establishment is asynchronous, we make sure that a map failure caused by an unavailable connection reaches the user with an appropriate error code. Before returning to the user, let's trigger the connection so that it's ready for the next retry. Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
\| *	RDS: RDMA: start rdma listening after init	Qing Huang	2017-01-02	1	-8/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This prevents RDS from handling incoming rdma packets before RDS completes initializing its recv/send components. Signed-off-by: Qing Huang <qing.huang@oracle.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
\| *	RDS: RDMA: fix the ib_map_mr_sg_zbva() argument	Santosh Shilimkar	2017-01-02	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Fixes warning: Using plain integer as NULL pointer Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
\| *	RDS: IB: make the transport retry count smallest	Santosh Shilimkar	2017-01-02	1	-1/+1
\| \|	Transport retry is not very useful since it indicates packet loss in the fabric, so it's better to fail over fast rather than retry for longer. Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>