Commit message [Author, Age, Files, Lines]
* net: sch: red: Change offloaded xstats to be incremental [Nogah Frankel, 2018-01-10, 2 files, -19/+20]
|  Change the value of the xstats requested from the driver for offloaded
|  RED to be incremental, like the normal stats. This improves consistency:
|  if a qdisc stops being offloaded, its xstats don't change.
|
|  Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
|  Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
|  Signed-off-by: Jiri Pirko <jiri@mellanox.com>
|  Signed-off-by: David S. Miller <davem@davemloft.net>
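The incremental reporting described above can be sketched as follows. This is a minimal illustration, not the kernel code: the struct names, the two counters, and `xstats_add_delta()` are hypothetical. The idea is that the caller adds only the growth since the previous query to the qdisc's own totals, so the totals stay valid after offload stops.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical counters; not the kernel's struct definitions. */
struct red_xstats {
    uint64_t early_drops;
    uint64_t marks;
};

/* Hypothetical per-qdisc offload state kept by the driver. */
struct offload_state {
    struct red_xstats hw_base; /* hardware totals at the previous query */
};

/* Add only the growth since the previous query to the qdisc totals. */
static void xstats_add_delta(struct red_xstats *qdisc_total,
                             struct offload_state *st,
                             const struct red_xstats *hw_now)
{
    assert(qdisc_total && st && hw_now);

    qdisc_total->early_drops += hw_now->early_drops - st->hw_base.early_drops;
    qdisc_total->marks += hw_now->marks - st->hw_base.marks;

    /* Remember the hardware totals so the next query reports a delta. */
    st->hw_base = *hw_now;
}
```

Because the qdisc accumulates deltas instead of copying absolute hardware values, nothing needs to be recomputed when the hardware counters disappear.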
* net: sch: red: Change the name of the stats struct to be generic [Nogah Frankel, 2018-01-10, 2 files, -6/+7]
|  Change the name of the stats struct to be generic, so that it can be
|  used for other qdisc offloads that will be added in the next patches.
|
|  Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
|  Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
|  Signed-off-by: Jiri Pirko <jiri@mellanox.com>
|  Signed-off-by: David S. Miller <davem@davemloft.net>
* mlxsw: spectrum: qdiscs: Move qdisc's declarations to its designated file [Nogah Frankel, 2018-01-10, 3 files, -25/+51]
|  Move all the qdisc related data from spectrum.h to spectrum_qdisc.c.
|  Create init and fini functions for the qdiscs.
|
|  Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
|  Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
|  Signed-off-by: Jiri Pirko <jiri@mellanox.com>
|  Signed-off-by: David S. Miller <davem@davemloft.net>
* mlxsw: spectrum: Fix typo in firmware upgrade message [Ido Schimmel, 2018-01-10, 1 file, -1/+1]
|  Signed-off-by: Ido Schimmel <idosch@mellanox.com>
|  Signed-off-by: Jiri Pirko <jiri@mellanox.com>
|  Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: make local function tcp_recv_timestamp static [Wei Yongjun, 2018-01-10, 1 file, -2/+2]
|  Fixes the following sparse warning:
|
|  net/ipv4/tcp.c:1736:6: warning:
|   symbol 'tcp_recv_timestamp' was not declared. Should it be static?
|
|  Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
|  Signed-off-by: David S. Miller <davem@davemloft.net>
* net/mlx5e: fix error return code in mlx5e_alloc_rq() [Wei Yongjun, 2018-01-10, 1 file, -1/+2]
|  Fix to return a negative error code from the xdp_rxq_info_reg() error
|  handling case instead of 0, as done elsewhere in this function.
|
|  Fixes: 0ddf543226ac ("xdp/mlx5: setup xdp_rxq_info")
|  Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
|  Reviewed-by: Jesper Dangaard Brouer <brouer@redhat.com>
|  Signed-off-by: David S. Miller <davem@davemloft.net>
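The class of bug fixed here can be illustrated with a small sketch. The function names below are hypothetical stand-ins, not the mlx5e code: the point is only that the result of the failing call must be stored in `err` before jumping to the cleanup label, otherwise the function returns whatever `err` held before (typically 0).

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a registration step that can fail. */
static int register_queue_info(int should_fail)
{
    return should_fail ? -EINVAL : 0;
}

/* Hypothetical allocation routine using the goto-cleanup idiom. */
static int alloc_queue(int should_fail)
{
    int err;

    /* Capture the return value so the error path can propagate it. */
    err = register_queue_info(should_fail);
    if (err < 0)
        goto err_cleanup;

    return 0;

err_cleanup:
    /* Nothing to release in this sketch; real code would undo earlier steps. */
    assert(err < 0);
    return err;
}
```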
* cxgb4vf: Fix SGE FL buffer initialization logic for 64K pages [Arjun Vynipadath, 2018-01-10, 1 file, -6/+17]
|  We'd come in with SGE_FL_BUFFER_SIZE[0] and [1] both equal to 64KB and
|  the extant logic would flag that as an error. This was already fixed in
|  the cxgb4 driver with "92ddcc7 cxgb4: Fix some small bugs in
|  t4_sge_init_soft() when our Page Size is 64KB".
|
|  Original Work by: Casey Leedom <leedom@chelsio.com>
|  Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
|  Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
|  Signed-off-by: David S. Miller <davem@davemloft.net>
* tuntap: fix for "tuntap: XDP transmission" [Stephen Rothwell, 2018-01-10, 1 file, -2/+2]
|  Fixes: fc72d1d54dd9 ("tuntap: XDP transmission")
|  Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
|  Acked-by: Jason Wang <jasowang@redhat.com>
|  Signed-off-by: David S. Miller <davem@davemloft.net>
* net: fix xdp_rxq_info build issue when CONFIG_SYSFS is not set [Jesper Dangaard Brouer, 2018-01-10, 1 file, -3/+0]
|  The commit e817f85652c1 ("xdp: generic XDP handling of xdp_rxq_info")
|  removed some ifdef CONFIG_SYSFS in net/core/dev.c, but forgot to
|  remove the corresponding ifdef's in include/linux/netdevice.h.
|
|  Fixes: e817f85652c1 ("xdp: generic XDP handling of xdp_rxq_info")
|  Reported-by: Guenter Roeck <linux@roeck-us.net>
|  Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
|  Tested-by: Guenter Roeck <linux@roeck-us.net>
|  Signed-off-by: David S. Miller <davem@davemloft.net>
* net: phy: marvell: mv88e6390 temperature sensor reading [Andrew Lunn, 2018-01-10, 1 file, -1/+150]
|  The internal PHYs in the mv88e6390 switch have a temperature sensor. It
|  uses a different register layout from the other PHYs currently
|  supported. It also has an erratum: some reads of the sensor result in
|  bad values. So a number of reads need to be made, and the average taken.
|
|  Signed-off-by: Andrew Lunn <andrew@lunn.ch>
|  Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
|  Signed-off-by: David S. Miller <davem@davemloft.net>
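The workaround described above (several reads, then the average) can be sketched in plain C. This is only an illustration of the averaging step under simple assumptions: the samples are already collected in an array, and `average_samples()` is a hypothetical helper, not a function from the Marvell PHY driver.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Average n raw sensor samples using integer division (truncating).
 * A single bad reading is diluted by the good ones instead of being
 * reported directly.
 */
static int average_samples(const int *samples, size_t n)
{
    long sum = 0;
    size_t i;

    assert(samples != NULL && n > 0);

    for (i = 0; i < n; i++)
        sum += samples[i];

    return (int)(sum / (long)n);
}
```

Averaging reduces the effect of an occasional bad read but does not remove it; how many reads are needed is a property of the hardware and is not stated in the commit message.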
* Merge branch 'net-create-dynamic-software-irq-moderation-library' [David S. Miller, 2018-01-10, 15 files, -411/+592]
|\
| |  Andy Gospodarek says:
| |
| |  ====================
| |  net: create dynamic software irq moderation library
| |
| |  This converts the dynamic interrupt moderation code from the mlx5e
| |  driver into a library so it can be used by any driver. The penultimate
| |  patch in this set adds support for this new dynamic interrupt
| |  moderation library in the bnxt_en driver and the last patch creates an
| |  entry in the MAINTAINERS file for this library.
| |
| |  The main purpose of this code is to allow an administrator to make sure
| |  that default coalesce settings are optimized for low latency, but
| |  quickly adapt to handle high throughput/bulk traffic by altering how
| |  much time passes before popping an interrupt.
| |
| |  For any new driver the following changes would be needed to use this
| |  library:
| |
| |  - add elements in ring struct to track items needed by this library
| |  - create function that can be called to actually set coalesce settings
| |    for the driver
| |
| |  Credit to Rob Rice and Lee Reed for doing some of the initial proof of
| |  concept and testing for this patch and Tal Gilboa and Or Gerlitz for
| |  their comments, etc. on this set.
| |
| |  v4: Fix build breakage for VF representers noticed by kbuild test
| |  robot. Thanks for being so courteous, kbuild test robot!
| |
| |  v3: bnxt_en fix from Michael Chan, comment suggestion from Vasundhara
| |  Volam, and small mlx5e header file fix from Tal Gilboa.
| |
| |  v2: Spelling fixes from Stephen Hemminger, bnxt_en suggestions from
| |  Michael Chan, spelling and formatting fixes from Or Gerlitz, and
| |  spelling and mlx5e changes suggested by Tal Gilboa.
| |  ====================
| |
| |  Signed-off-by: David S. Miller <davem@davemloft.net>
| * MAINTAINERS: add entry for Dynamic Interrupt Moderation [Andy Gospodarek, 2018-01-10, 1 file, -0/+5]
| |  Signed-off-by: Andy Gospodarek <gospo@broadcom.com>
| |  Signed-off-by: Tal Gilboa <talgi@mellanox.com>
| |  Acked-by: Saeed Mahameed <saeedm@mellanox.com>
| |  Signed-off-by: David S. Miller <davem@davemloft.net>
| * bnxt_en: add support for software dynamic interrupt moderation [Andy Gospodarek, 2018-01-10, 5 files, -12/+118]
| |  This implements the changes needed for the bnxt_en driver to add
| |  support for dynamic interrupt moderation per ring.
| |
| |  This does add additional counters in the receive path, but testing
| |  shows that any additional instructions are offset by throughput gain
| |  when the default configuration is for low latency.
| |
| |  Signed-off-by: Andy Gospodarek <gospo@broadcom.com>
| |  Acked-by: Michael Chan <michael.chan@broadcom.com>
| |  Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/dim: use struct net_dim_sample as arg to net_dim [Andy Gospodarek, 2018-01-10, 2 files, -12/+11]
| |  Simplify the arguments to net_dim() by formatting them into a struct
| |  net_dim_sample before calling the function.
| |
| |  Signed-off-by: Andy Gospodarek <gospo@broadcom.com>
| |  Suggested-by: Tal Gilboa <talgi@mellanox.com>
| |  Acked-by: Tal Gilboa <talgi@mellanox.com>
| |  Acked-by: Saeed Mahameed <saeedm@mellanox.com>
| |  Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5e: Move dynamic interrupt coalescing code to include/linux [Andy Gospodarek, 2018-01-10, 5 files, -134/+97]
| |  This move allows drivers to add private structure elements to track
| |  the number of packets, bytes, and interrupt events per ring. A driver
| |  also defines a workqueue handler to act on this collected data once
| |  per poll and modify the coalescing parameters per ring.
| |
| |  Signed-off-by: Andy Gospodarek <gospo@broadcom.com>
| |  Acked-by: Tal Gilboa <talgi@mellanox.com>
| |  Acked-by: Saeed Mahameed <saeedm@mellanox.com>
| |  Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5e: Change Mellanox references in DIM code [Andy Gospodarek, 2018-01-10, 8 files, -202/+226]
| |  Change all appropriate mlx5_am* and MLX5_AM* references to net_dim and
| |  NET_DIM, respectively, in code that handles dynamic interrupt
| |  moderation. Also change all references from 'am' to 'dim' when used as
| |  local variables and add generic profile references.
| |
| |  Signed-off-by: Andy Gospodarek <gospo@broadcom.com>
| |  Acked-by: Tal Gilboa <talgi@mellanox.com>
| |  Acked-by: Saeed Mahameed <saeedm@mellanox.com>
| |  Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5e: Move generic functions to new file [Andy Gospodarek, 2018-01-10, 5 files, -20/+55]
| |  These functions were identified as ones that could be made generic and
| |  used by multiple drivers. Most of the contents of en_rx_am.c are moved
| |  to net_dim.c.
| |
| |  Signed-off-by: Andy Gospodarek <gospo@broadcom.com>
| |  Acked-by: Tal Gilboa <talgi@mellanox.com>
| |  Acked-by: Saeed Mahameed <saeedm@mellanox.com>
| |  Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5e: Move AM logic enums [Andy Gospodarek, 2018-01-10, 2 files, -25/+26]
| |  More movement to help make this code more generic.
| |
| |  Signed-off-by: Andy Gospodarek <gospo@broadcom.com>
| |  Acked-by: Tal Gilboa <talgi@mellanox.com>
| |  Acked-by: Saeed Mahameed <saeedm@mellanox.com>
| |  Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5e: Remove rq references in mlx5e_rx_am [Andy Gospodarek, 2018-01-10, 3 files, -12/+21]
| |  This makes mlx5e_am_sample more generic so that it can be called
| |  easily from a driver that does not store these values together in the
| |  same data structure.
| |
| |  Signed-off-by: Andy Gospodarek <gospo@broadcom.com>
| |  Acked-by: Tal Gilboa <talgi@mellanox.com>
| |  Acked-by: Saeed Mahameed <saeedm@mellanox.com>
| |  Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5e: Move interrupt moderation forward declarations [Andy Gospodarek, 2018-01-10, 2 files, -4/+5]
| |  Move these to the newly created file in preparation for moving these
| |  functions to a library.
| |
| |  Signed-off-by: Andy Gospodarek <gospo@broadcom.com>
| |  Acked-by: Tal Gilboa <talgi@mellanox.com>
| |  Acked-by: Saeed Mahameed <saeedm@mellanox.com>
| |  Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5e: Move interrupt moderation structs to new file [Andy Gospodarek, 2018-01-10, 2 files, -32/+70]
|/
|  Create new header file to prepare to move code that handles irq
|  moderation to a library that lives in a header file.
|
|  Signed-off-by: Andy Gospodarek <gospo@broadcom.com>
|  Acked-by: Tal Gilboa <talgi@mellanox.com>
|  Acked-by: Saeed Mahameed <saeedm@mellanox.com>
|  Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'ipv6-Add-support-for-non-equal-cost-multipath' [David S. Miller, 2018-01-10, 4 files, -32/+126]
|\
| |  Ido Schimmel says:
| |
| |  ====================
| |  ipv6: Add support for non-equal-cost multipath
| |
| |  This set aims to add support for IPv6 non-equal-cost multipath routes.
| |  The first three patches convert multipath selection to use the
| |  hash-threshold method (RFC 2992) instead of modulo-N. The same method
| |  is employed by the IPv4 routing code since commit 0e884c78ee19 ("ipv4:
| |  L3 hash-based multipath").
| |
| |  Unlike modulo-N, with hash-threshold only the flows near the region
| |  boundaries are affected when a nexthop is added or removed. In
| |  addition, it allows us to easily add support for non-equal-cost
| |  multipath in the last patch by sizing the different regions according
| |  to the provided weights.
| |  ====================
| |
| |  Signed-off-by: David S. Miller <davem@davemloft.net>
| * ipv6: Add support for non-equal-cost multipath [Ido Schimmel, 2018-01-10, 2 files, -4/+8]
| |  The use of hash-threshold instead of modulo-N makes it trivial to add
| |  support for non-equal-cost multipath.
| |
| |  Instead of dividing the multipath hash function's output space equally
| |  between the nexthops, each nexthop is assigned a region size which is
| |  proportional to its weight.
| |
| |  Signed-off-by: Ido Schimmel <idosch@mellanox.com>
| |  Acked-by: David Ahern <dsahern@gmail.com>
| |  Signed-off-by: David S. Miller <davem@davemloft.net>
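Weighted hash-threshold selection as described in this series can be sketched in user-space C. This is an illustration under stated assumptions, not the kernel implementation: `struct nexthop`, `rebalance()` and `select_nexthop()` are hypothetical names, and the rounding choice is one reasonable option. Each active nexthop gets an inclusive upper bound in the 31-bit hash space proportional to its cumulative weight; inactive nexthops get -1 so that no hash can match them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical nexthop record for illustration only. */
struct nexthop {
    unsigned int weight;  /* relative share of the hash space */
    int alive;            /* 0 when dead or linkdown */
    int32_t upper_bound;  /* inclusive end of region, or -1 if excluded */
};

/* Recompute region boundaries; call after any add/remove/state change. */
static void rebalance(struct nexthop *nh, size_t n)
{
    uint64_t total = 0, cum = 0;
    size_t i;

    for (i = 0; i < n; i++)
        if (nh[i].alive)
            total += nh[i].weight;

    for (i = 0; i < n; i++) {
        if (!nh[i].alive || total == 0) {
            nh[i].upper_bound = -1;
            continue;
        }
        cum += nh[i].weight;
        /* round(cum * 2^31 / total) - 1, so the last bound is 2^31 - 1 */
        nh[i].upper_bound = (int32_t)(((cum << 31) + total / 2) / total - 1);
    }
}

/* Return the index of the first region containing hash31, or -1. */
static int select_nexthop(const struct nexthop *nh, size_t n, uint32_t hash31)
{
    size_t i;

    assert(hash31 <= (uint32_t)INT32_MAX);

    for (i = 0; i < n; i++)
        if ((int64_t)hash31 <= (int64_t)nh[i].upper_bound)
            return (int)i;
    return -1;
}
```

With weights 1 and 3, the first nexthop owns the lowest quarter of the hash space and the second owns the rest. The lookup needs no separate dead/linkdown check because a bound of -1 can never satisfy the comparison.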
| * ipv6: Use hash-threshold instead of modulo-N [Ido Schimmel, 2018-01-10, 1 file, -23/+13]
| |  Now that each nexthop stores its region boundary in the multipath hash
| |  function's output space, we can use hash-threshold instead of modulo-N
| |  in multipath selection.
| |
| |  This reduces the number of checks we need to perform during lookup, as
| |  dead and linkdown nexthops are assigned a negative region boundary. In
| |  addition, in contrast to modulo-N, only flows near region boundaries
| |  are affected when a nexthop is added or removed.
| |
| |  Signed-off-by: Ido Schimmel <idosch@mellanox.com>
| |  Acked-by: David Ahern <dsahern@gmail.com>
| |  Signed-off-by: David S. Miller <davem@davemloft.net>
| * ipv6: Use a 31-bit multipath hash [Ido Schimmel, 2018-01-10, 1 file, -2/+2]
| |  The hash thresholds assigned to IPv6 nexthops are in the range of
| |  [-1, 2^31 - 1], where a negative value is assigned to nexthops that
| |  should not be considered during multipath selection.
| |
| |  Therefore, in a similar fashion to IPv4, we need to use the upper
| |  31 bits of the multipath hash for multipath selection.
| |
| |  Signed-off-by: Ido Schimmel <idosch@mellanox.com>
| |  Acked-by: David Ahern <dsahern@gmail.com>
| |  Signed-off-by: David S. Miller <davem@davemloft.net>
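Why 31 bits are used can be shown with a short sketch. The helper names are hypothetical: the thresholds are signed 32-bit values with -1 meaning "excluded", so the hash must be confined to [0, 2^31 - 1] before it is compared against them. Dropping the lowest bit keeps the upper 31 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Keep the upper 31 bits of a 32-bit hash; result is in [0, 2^31 - 1]. */
static uint32_t multipath_hash31(uint32_t hash32)
{
    return hash32 >> 1;
}

/* True when the hash falls at or below a nexthop's (signed) threshold. */
static int hash_in_region(uint32_t hash32, int32_t upper_bound)
{
    assert(upper_bound >= -1);
    /* The cast is safe: a 31-bit value always fits in int32_t. */
    return (int32_t)multipath_hash31(hash32) <= upper_bound;
}
```

A full 32-bit hash compared against a signed threshold could exceed 2^31 - 1 and never match any region; the 31-bit form avoids that while leaving -1 free as the "never select" marker.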
| * ipv6: Calculate hash thresholds for IPv6 nexthops [Ido Schimmel, 2018-01-10, 4 files, -6/+106]
|/
|  Before we convert IPv6 to use hash-threshold instead of modulo-N, we
|  first need each nexthop to store its region boundary in the hash
|  function's output space.
|
|  The boundary is calculated by dividing the output space equally between
|  the different active nexthops, that is, nexthops that are not dead or
|  linkdown.
|
|  The boundaries are rebalanced whenever a nexthop is added to or removed
|  from a multipath route and whenever a nexthop becomes active or inactive.
|
|  Signed-off-by: Ido Schimmel <idosch@mellanox.com>
|  Acked-by: David Ahern <dsahern@gmail.com>
|  Signed-off-by: David S. Miller <davem@davemloft.net>
* vhost_net: batch used ring update in rx [Jason Wang, 2018-01-10, 1 file, -4/+11]
|  This patch tries to batch used ring updates during RX. This is a good
|  fit for the case when the guest is much faster (e.g. a dpdk based
|  backend). In this case, the used ring is almost empty:
|
|  - we may get serious cache line misses/contention on both the used ring
|    and the used idx.
|  - at most 1 packet can be dequeued at a time, so batching in the guest
|    does not have much effect.
|
|  Updating the used ring in a batch can help since the guest won't access
|  the used ring until the used idx has been advanced by several
|  descriptors, and since we advance the used ring every N packets, the
|  guest only needs to access the used idx every N packets since it can
|  cache the used idx. To interact better with both batch dequeuing and
|  dpdk batching, VHOST_RX_BATCH is used as the maximum number of
|  descriptors that can be batched.
|
|  Tests were done between two machines with 2.40GHz Intel(R) Xeon(R) CPU
|  E5-2630 connected back to back through ixgbe. Traffic was generated on
|  one remote ixgbe through MoonGen, and the RX pps was measured through
|  testpmd in the guest while doing xdp_redirect_map from the local ixgbe
|  to tap. RX pps increased from 3.05 Mpps to 4.00 Mpps (about 31%
|  improvement).
|
|  One possible concern is the implications for TCP (especially latency
|  sensitive workloads). Result[1] does not show obvious changes for most
|  of the netperf tests (RR, TX, and RX). And we do get some improvements
|  for RX at some specific sizes.
|
|  Guest RX:
|
|  size/sessions/+thu%/+normalize%
|     64/     1/   +2%/   +2%
|     64/     2/   +2%/   -1%
|     64/     4/   +1%/   +1%
|     64/     8/    0%/    0%
|    256/     1/   +6%/   -3%
|    256/     2/   -3%/   +2%
|    256/     4/  +11%/  +11%
|    256/     8/    0%/    0%
|    512/     1/   +4%/    0%
|    512/     2/   +2%/   +2%
|    512/     4/    0%/   -1%
|    512/     8/   -8%/   -8%
|   1024/     1/   -7%/  -17%
|   1024/     2/   -8%/   -7%
|   1024/     4/   +1%/    0%
|   1024/     8/    0%/    0%
|   2048/     1/  +30%/  +14%
|   2048/     2/  +46%/  +40%
|   2048/     4/    0%/    0%
|   2048/     8/    0%/    0%
|   4096/     1/  +23%/  +22%
|   4096/     2/  +26%/  +23%
|   4096/     4/    0%/   +1%
|   4096/     8/    0%/    0%
|  16384/     1/   -2%/   -3%
|  16384/     2/   +1%/   -4%
|  16384/     4/   -1%/   -3%
|  16384/     8/    0%/   -1%
|  65535/     1/  +15%/   +7%
|  65535/     2/   +4%/   +7%
|  65535/     4/    0%/   +1%
|  65535/     8/    0%/    0%
|
|  TCP_RR:
|
|  size/sessions/+thu%/+normalize%
|      1/     1/    0%/   +1%
|      1/    25/   +2%/   +1%
|      1/    50/   +4%/   +1%
|     64/     1/    0%/   -4%
|     64/    25/   +2%/   +1%
|     64/    50/    0%/   -1%
|    256/     1/    0%/    0%
|    256/    25/    0%/    0%
|    256/    50/   +4%/   +2%
|
|  Guest TX:
|
|  size/sessions/+thu%/+normalize%
|     64/     1/   +4%/   -2%
|     64/     2/   -6%/   -5%
|     64/     4/   +3%/   +6%
|     64/     8/    0%/   +3%
|    256/     1/  +15%/  +16%
|    256/     2/  +11%/  +12%
|    256/     4/   +1%/    0%
|    256/     8/   +5%/   +5%
|    512/     1/   -1%/   -6%
|    512/     2/    0%/   -8%
|    512/     4/   -2%/   +4%
|    512/     8/   +6%/   +9%
|   1024/     1/   +3%/   +1%
|   1024/     2/   +3%/   +9%
|   1024/     4/    0%/   +7%
|   1024/     8/    0%/   +7%
|   2048/     1/   +8%/   +2%
|   2048/     2/   +3%/   -1%
|   2048/     4/   -1%/  +11%
|   2048/     8/   +3%/   +9%
|   4096/     1/   +8%/   +8%
|   4096/     2/    0%/   -7%
|   4096/     4/   +4%/   +4%
|   4096/     8/   +2%/   +5%
|  16384/     1/   -3%/   +1%
|  16384/     2/   -1%/  -12%
|  16384/     4/   -1%/   +5%
|  16384/     8/    0%/   +1%
|  65535/     1/    0%/   -3%
|  65535/     2/   +5%/  +16%
|  65535/     4/   +1%/   +2%
|  65535/     8/   +1%/   -1%
|
|  Signed-off-by: Jason Wang <jasowang@redhat.com>
|  Acked-by: Michael S. Tsirkin <mst@redhat.com>
|  Signed-off-by: David S. Miller <davem@davemloft.net>
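The batching idea in this commit can be sketched in a few lines of C. This is a simplified model, not the vhost code: `struct used_ring`, `complete_rx()` and `flush_used()` are hypothetical, and `RX_BATCH` is an assumed value standing in for the batch limit the commit calls VHOST_RX_BATCH. Completed descriptors are counted locally and the shared index is written once per batch instead of once per packet.

```c
#include <assert.h>

#define RX_BATCH 64 /* assumed batch limit for this sketch */

/* Simplified model of the state shared with the guest. */
struct used_ring {
    unsigned int used_idx;  /* index visible to the guest */
    unsigned int pending;   /* completed but not yet published */
    unsigned int publishes; /* how many times used_idx was written */
};

/* Publish all pending completions with a single index update. */
static void flush_used(struct used_ring *r)
{
    if (r->pending == 0)
        return;
    r->used_idx += r->pending;
    r->pending = 0;
    r->publishes++;
}

/* Record one received packet; publish only when the batch is full. */
static void complete_rx(struct used_ring *r)
{
    assert(r->pending < RX_BATCH);
    if (++r->pending >= RX_BATCH)
        flush_used(r);
}
```

A caller would also invoke `flush_used()` when the receive loop runs out of work, so that a partial batch is not held back indefinitely. Fewer writes to the shared index mean fewer cache line transfers between host and guest, which is the effect the commit message is after.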
* Merge tag 'mlx5-updates-2018-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux [David S. Miller, 2018-01-10, 17 files, -27/+649]
|\
| |  mlx5-updates-2018-01-08
| |
| |  Four patches from Or that add Hairpin support to mlx5:
| |  ===========================================================
| |  From: Or Gerlitz <ogerlitz@mellanox.com>
| |
| |  We refer to the ability of NIC HW to forward a packet received on one
| |  port to the other port (also from a port to itself) as hairpin. The
| |  application API is based on ingress tc/flower rules set on the NIC
| |  with the mirred redirect action. Other actions can apply to packets
| |  during the redirect.
| |
| |  Hairpin allows offloading the data-path of various SW DDoS gateways,
| |  load-balancers, etc. to HW. Packets go through all the required
| |  processing in HW (header re-write, encap/decap, push/pop vlan) and are
| |  then forwarded; the CPU stays at practically zero usage. HW flow
| |  counters are used by the control plane for monitoring and accounting.
| |
| |  Hairpin is implemented by pairing a receive queue (RQ) to a send queue
| |  (SQ). All the flows that share <recv NIC, mirred NIC> are redirected
| |  through the same hairpin pair. Currently, only header-rewrite is
| |  supported as a packet modification action.
| |
| |  I'd like to thank Elijah Shakkour <elijahs@mellanox.com> for
| |  implementing this functionality on the HW simulator before it was
| |  available in the FW, so the driver code could be tested early.
| |  ===========================================================
| |
| |  From Feras, three patches that provide very small changes that allow
| |  IPoIB to support RX timestamping for child interfaces, simply by
| |  hooking the mlx5e timestamping PTP ioctl to the IPoIB child interface
| |  netdev profile.
| |
| |  One patch from Gal to fix a spelling mistake.
| |
| |  Two patches from Eugenia add drop counters to VF statistics, to be
| |  reported as part of VF statistics in netlink (iproute2), and implement
| |  them in the mlx5 eswitch.
| |
| |  Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5e: E-switch, Add steering drop counters [Eugenia Emantayev, 2018-01-09, 4 files, -2/+112]
| |  Add flow counters to count packets dropped due to drop rules
| |  configured in eswitch egress and ingress ACLs. These counters count
| |  VF violations and incoming traffic drops. They are presented on the
| |  hypervisor via the standard 'ip -s link show' command.
| |
| |  Example: "ip -s link show dev enp5s0f0"
| |
| |  6: enp5s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
| |      link/ether 24:8a:07:a5:28:f0 brd ff:ff:ff:ff:ff:ff
| |      RX: bytes packets errors dropped overrun mcast
| |      0 0 0 0 0 2
| |      TX: bytes packets errors dropped carrier collsns
| |      1406 17 0 0 0 0
| |      vf 0 MAC 00:00:ca:fe:ca:fe, vlan 5, spoof checking off, link-state auto, trust off, query_rss off
| |      RX: bytes packets mcast bcast dropped
| |      1666 29 14 32 0
| |      TX: bytes packets dropped
| |      2880 44 2412
| |
| |  Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
| |  Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * net/core: Add drop counters to VF statistics [Eugenia Emantayev, 2018-01-09, 3 files, -1/+13]
| |  Modern hardware can decide to drop packets going to/from a VF.
| |  Add receive and transmit drop counters to be displayed at hypervisor
| |  layer in iproute2 per VF statistics.
| |
| |  Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
| |  Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * net/mlx5e: IPoIB, Fix spelling mistake "functionts" -> "functions" [Gal Pressman, 2018-01-09, 1 file, -1/+1]
| |  Fix trivial spelling mistake: "functionts" -> "functions".
| |
| |  Signed-off-by: Gal Pressman <galp@mellanox.com>
| |  Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * net/mlx5e: IPoIB, Add ethtool support to get child time stamping parameters [Feras Daoud, 2018-01-09, 1 file, -0/+1]
| |  Add support to get time stamping capabilities using ethtool for child
| |  interface.
| |
| |  Usage example:
| |      ethtool -T CHILD-DEVNAME
| |
| |  This change reuses the functionality of parent devices and does not
| |  introduce any new logic.
| |
| |  Signed-off-by: Feras Daoud <ferasda@mellanox.com>
| |  Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * net/mlx5e: IPoIB, Add PTP ioctl support for child interface [Feras Daoud, 2018-01-09, 3 files, -2/+9]
| |  Add support to control precision time protocol on child interfaces
| |  using ioctl.
| |
| |  This commit changes the following:
| |  - Change parent ioctl function to be non static
| |  - Reuse the parent ioctl function in child devices
| |
| |  Signed-off-by: Feras Daoud <ferasda@mellanox.com>
| |  Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * net/mlx5e: IPoIB, Use correct timestamp in child receive flow [Feras Daoud, 2018-01-09, 1 file, -1/+6]
| |  The current implementation takes the child timestamp object from the
| |  parent since the rq in mlx5i_complete_rx_cqe belongs to the parent.
| |  This change fixes the issue by taking the correct timestamp.
| |
| |  Fixes: 7e7f4780c340 ("net/mlx5e: IPoIB, Use hash-table to map between QPN to child netdev")
| |  Signed-off-by: Feras Daoud <ferasda@mellanox.com>
| |  Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * net/mlx5e: Support offloading TC NIC hairpin flows [Or Gerlitz, 2018-01-09, 2 files, -12/+172]
| |  We refer to a TC NIC rule that involves forwarding as "hairpin".
| |
| |  All hairpin rules from the current NIC device (called "func" in the
| |  code) to a given NIC device ("peer") are steered into the same hairpin
| |  RQ/SQ pair.
| |
| |  The hairpin pair is set up on demand and removed when there are no TC
| |  rules that need it.
| |
| |  Here's a TC rule that matches on icmp, does header re-write of the dst
| |  mac and hairpin from RX/enp1s2f1 to TX/enp1s2f2 (enp1s2f1/2 are two
| |  mlx5 devices):
| |
| |      tc filter add dev enp1s2f1 protocol ip parent ffff: prio 2 flower skip_sw ip_proto icmp action pedit ex munge eth dst set 10:22:33:44:55:66 pipe action mirred egress redirect dev enp1s2f2
| |
| |  Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
| |  Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * net/mlx5e: Basic setup of hairpin object [Or Gerlitz, 2018-01-09, 1 file, -0/+97]
| |  Add the code to do basic setup for hairpin object which will later
| |  serve offloading TC flows.
| |
| |  This includes calling the mlx5 core to create/destroy the hairpin pair
| |  object and setting the HW transport objects that will be used for
| |  steering matched flows to go through hairpin.
| |
| |  Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
| |  Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * net/mlx5: Hairpin pair core object setup [Or Gerlitz, 2018-01-09, 2 files, -0/+203]
| |  Low level code to setup hairpin pair core object, deals with:
| |  - create hairpin RQs/SQs
| |  - destroy hairpin RQs/SQs
| |  - modifying hairpin RQs/SQs
| |  - pairing (rst2rdy) and unpairing (rdy2rst)
| |
| |  Unlike conventional RQs/SQs, the memory used for the packet and
| |  descriptor buffers is allocated by the firmware and not the driver.
| |  The driver sets the overall data size (log).
| |
| |  Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
| |  Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * net/mlx5: Add hairpin definitions to the FW API [Or Gerlitz, 2018-01-09, 1 file, -8/+35]
| |  Add hairpin definitions to the IFC file. This includes the HCA ID, a
| |  few HCA hairpin capabilities, new fields in RQ/SQ used later for the
| |  pairing and the WQ hairpin data size attribute.
| |
| |  Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
| |  Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | Merge branch 'hns3-next' [David S. Miller, 2018-01-10, 2 files, -81/+3]
|\ \
| | |  Peng Li says:
| | |
| | |  ====================
| | |  code improvements in HNS3 driver
| | |
| | |  This patchset addresses 2 comments from community review.
| | |  [patch 1/2] reverts "net: hns3: Add packet statistics of netdev",
| | |  as reported by Jakub Kicinski and David Miller.
| | |  [patch 2/2] puts the function type on the same line as
| | |  hns3_nic_get_stats64, as reported by Andrew Lunn.
| | |  ====================
| | |
| | |  Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: hns3: report the function type the same line with hns3_nic_get_stats64 [Peng Li, 2018-01-10, 1 file, -2/+2]
| | |  The function type should be on the same line as the function name,
| | |  or it may cause a display error if a patch edits the function.
| | |
| | |  An example follows:
| | |  https://www.spinics.net/lists/netdev/msg476141.html
| | |
| | |  Signed-off-by: Peng Li <lipeng321@huawei.com>
| | |  Signed-off-by: David S. Miller <davem@davemloft.net>
| * | Revert "net: hns3: Add packet statistics of netdev" [Peng Li, 2018-01-10, 1 file, -79/+1]
|/ /
| |  This reverts commit 8491000754796c838a0081c267f9dd54ad2ccba3.
| |
| |  Adding netdev statistics to ethtool -S duplicates statistics that
| |  are already available.
| |
| |  Signed-off-by: Peng Li <lipeng321@huawei.com>
| |  Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'Socionext-Synquacer-NETSEC-driver' [David S. Miller, 2018-01-10, 5 files, -0/+1848]
|\ \
| | |  Jassi Brar says:
| | |
| | |  ====================
| | |  Socionext Synquacer NETSEC driver
| | |
| | |  Changes since v5
| | |   # Removed helper macros
| | |   # Removed 'inline' qualifier
| | |   # Changed multiline empty comment to single line
| | |   # Added 'clock-names' property in DT binding example
| | |   # Ignore 'clock-names' property in driver until f/ws in the wild are
| | |     upgraded or we support instances that take in more than one clock.
| | |   # Rebased the patchset onto net-next
| | |
| | |  Changes since v4
| | |   # Fixed ucode indexing as a word, instead of byte
| | |   # Removed redundant clocks, keep only phy rate reference clock
| | |     and expect it to be 'phy_ref_clk'
| | |
| | |  Changes since v3
| | |   # Discard 'socionext,snq-mdio', and simply use 'mdio' subnode.
| | |   # Use ioremap on ucode region as well, instead of memremap.
| | |
| | |  Changes since v2
| | |   # Use 'mdio' subnode in DT bindings.
| | |   # Use phy_interface_mode_is_rgmii(), instead of open coding the check.
| | |   # Use readl/b with eeprom_base pointer.
| | |   # Unregister mdio bus upon failure in probe.
| | |
| | |  Changes since v1
| | |   # Switched from using memremap to ioremap
| | |   # Implemented ndo_do_ioctl callback
| | |   # Defined optional 'dma-coherent' DT property
| | |  ====================
| | |
| | |  Signed-off-by: David S. Miller <davem@davemloft.net>
| * | MAINTAINERS: Add entry for Socionext ethernet driver [Jassi Brar, 2018-01-10, 1 file, -0/+7]
| | |  Add entry for the Socionext Netsec controller driver and DT bindings.
| | |
| | |  Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
| | |  Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
| | |  Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: socionext: Add Synquacer NetSec driver [Jassi Brar, 2018-01-10, 3 files, -0/+1788]
| | |  This driver adds support for Socionext "netsec" IP Gigabit
| | |  Ethernet + PHY IP used in the Synquacer SC2A11 SoC.
| | |
| | |  Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
| | |  Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
| | |  Signed-off-by: David S. Miller <davem@davemloft.net>
| * | dt-bindings: net: Add DT bindings for Socionext Netsec [Jassi Brar, 2018-01-10, 1 file, -0/+53]
|/ /
| |  This patch adds documentation for Device-Tree bindings for the
| |  Socionext NetSec Controller driver.
| |
| |  Reviewed-by: Rob Herring <robh@kernel.org>
| |  Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
| |  Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
| |  Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue [David S. Miller, 2018-01-10, 10 files, -177/+298]
|\ \
| | |  Jeff Kirsher says:
| | |
| | |  ====================
| | |  10GbE Intel Wired LAN Driver Updates 2018-01-09
| | |
| | |  This series contains updates to ixgbe and ixgbevf only.
| | |
| | |  Emil fixes an issue with "wake on LAN" (WoL) where we need to ensure
| | |  we enable the reception of multicast packets so that WoL works for
| | |  IPv6 magic packets. Cleaned up code no longer needed with the update
| | |  to adaptive ITR.
| | |
| | |  Paul updates the driver to advertise the highest capable link speed
| | |  when a module gets inserted. Also extended the display of the
| | |  firmware version to include the iSCSI and OEM block in the EEPROM to
| | |  better identify firmware versions/images.
| | |
| | |  Tonghao Zhang cleans up a code comment that no longer applies since
| | |  InterruptThrottleRate has been removed from the driver.
| | |
| | |  Alex fixes SR-IOV and MACVLAN offload interaction, where the MACVLAN
| | |  offload was incorrectly configuring several filters with the wrong
| | |  pool value, which resulted in MACVLAN interfaces not being able to
| | |  receive traffic that had to pass over the physical interface. Fixed
| | |  transmit hangs and dropped receive frames when the number of VFs
| | |  changed. Added support for RSS on MACVLAN pools for X550 devices.
| | |  Fixed up the MACVLAN limitations so we can now support 63 offloaded
| | |  devices. Cleaned up MACVLAN code that is no longer needed with the
| | |  recent changes and fixes.
| | |  ====================
| | |
| | |  Signed-off-by: David S. Miller <davem@davemloft.net>
| * | ixgbe: Drop l2_accel_priv data pointer from ring struct [Alexander Duyck, 2018-01-09, 2 files, -11/+13]
| | |  The l2 acceleration private pointer isn't needed in the ring struct.
| | |  It isn't really used anywhere other than to test and see if we are
| | |  supporting an offloaded macvlan netdev, and it is much easier to test
| | |  netdev for not being ixgbe based to verify that.
| | |
| | |  Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
| | |  Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
| | |  Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * | ixgbe: Use ring values to test for Tx pending [Alexander Duyck, 2018-01-09, 1 file, -16/+4]
| | |  This patch simplifies the check for Tx pending traffic and makes it
| | |  more holistic. Any difference between next_to_use and next_to_clean
| | |  is much more informative than whether head and tail are equal, as it
| | |  is possible for us to either not update tail, or not be notified of
| | |  completed work, in which case next_to_clean would not be equal to
| | |  head.
| | |
| | |  In addition, the simplification means we don't have to read
| | |  hardware, which allows us to drop a number of variables that were
| | |  previously being used in the call.
| | |
| | |  Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
| | |  Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
| | |  Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
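The software-only check described above can be sketched as follows. The struct and function here are hypothetical simplifications, not the ixgbe definitions: a ring has outstanding Tx work whenever the producer index (next_to_use) differs from the consumer index (next_to_clean), and no hardware register read is involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified Tx ring bookkeeping for illustration only. */
struct tx_ring {
    uint16_t count;         /* number of descriptors in the ring */
    uint16_t next_to_use;   /* next slot the driver will fill */
    uint16_t next_to_clean; /* next slot the driver will reclaim */
};

/* True when descriptors were queued but not yet cleaned up. */
static bool tx_ring_pending(const struct tx_ring *ring)
{
    assert(ring->next_to_use < ring->count);
    assert(ring->next_to_clean < ring->count);
    return ring->next_to_use != ring->next_to_clean;
}
```

Both indices live in driver memory, so the check is cheap and also catches the cases the commit mentions, where the hardware head/tail comparison would look idle even though the driver has not finished its own cleanup.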
| * | ixgbe: Fix limitations on macvlan so we can support up to 63 offloaded devices [Alexander Duyck, 2018-01-09, 4 files, -43/+34]
| | |  This change is a fix of the macvlan offload so that we correctly
| | |  handle macvlan offloaded devices. Specifically we were configuring
| | |  our limits based on the assumption that we were going to max out the
| | |  RSS indices for every mode. As a result when we went to 15 or more
| | |  macvlan interfaces we were forced into the 2 queue RSS mode on VFs
| | |  even though they could have still supported 4.
| | |
| | |  This change splits the logic up so that we limit either the total
| | |  number of macvlan instances if DCB is enabled, or limit the number of
| | |  RSS queues used per macvlan (instead of per pool) if SR-IOV is
| | |  enabled. By doing this we can make best use of the part.
| | |
| | |  In addition I have increased the maximum number of supported
| | |  interfaces to 63 with one queue per offloaded interface as this more
| | |  closely reflects the actual values supported by the interface.
| | |
| | |  Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
| | |  Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
| | |  Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * | ixgbe: There is no need to update num_rx_pools in L2 fwd offload [Alexander Duyck, 2018-01-09, 2 files, -4/+1]
| | |  The num_rx_pools value is overwritten when we reinitialize the queue
| | |  configuration. In reality we shouldn't need to be updating the value
| | |  since it is redone every time we call into ixgbe_setup_tc so for now
| | |  just drop the spots where we were incrementing or decrementing the
| | |  value.
| | |
| | |  Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
| | |  Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
| | |  Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>