summaryrefslogtreecommitdiffstats
path: root/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
Commit message (Collapse)AuthorAgeFilesLines
* cxgb4: Fix race condition in cleanupAnish Bhatt2014-08-211-14/+18
| | | | | | | | | | | | | | | There is a possible race condition when we unregister the PCI Driver and then flush/destroy the global "workq". This could lead to situations where there are tasks on the Work Queue with references to now deleted adapter data structures. Instead, have per-adapter Work Queues which were instantiated and torn down in init_one() and remove_one(), respectively. v2: Remove unnecessary call to flush_workqueue() before destroy_workqueue() Signed-off-by: Anish Bhatt <anish@chelsio.com> Signed-off-by: Casey Leedom <leedom@chelsio.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge tag 'pci-v3.17-changes-2' of ↵Linus Torvalds2014-08-141-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull DEFINE_PCI_DEVICE_TABLE removal from Bjorn Helgaas: "Part two of the PCI changes for v3.17: - Remove DEFINE_PCI_DEVICE_TABLE macro use (Benoit Taine) It's a mechanical change that removes uses of the DEFINE_PCI_DEVICE_TABLE macro. I waited until later in the merge window to reduce conflicts, but it's possible you'll still see a few" * tag 'pci-v3.17-changes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: Remove DEFINE_PCI_DEVICE_TABLE macro use
| * PCI: Remove DEFINE_PCI_DEVICE_TABLE macro useBenoit Taine2014-08-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We should prefer `struct pci_device_id` over `DEFINE_PCI_DEVICE_TABLE` to meet kernel coding style guidelines. This issue was reported by checkpatch. A simplified version of the semantic patch that makes this change is as follows (http://coccinelle.lip6.fr/): // <smpl> @@ identifier i; declarer name DEFINE_PCI_DEVICE_TABLE; initializer z; @@ - DEFINE_PCI_DEVICE_TABLE(i) + const struct pci_device_id i[] = z; // </smpl> [bhelgaas: add semantic patch] Signed-off-by: Benoit Taine <benoit.taine@lip6.fr> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* | cxgb4: IEEE fixes for DCBx state machineAnish Bhatt2014-08-071-0/+2
|/ | | | | | | | | | | | | | | | | | | * Changes required due to 16eecd9be4b05 ("dcbnl : Fix misleading dcb_app->priority explanation") * Driver was previously not aware of what DCBx version was negotiated by firmware, this could lead to DCB app table in kernel or in firmware being populated wrong since IEEE/CEE used different formats made clear by above mentioned commit * Driver was missing a couple of state transitions that could be caused by other drivers that use chelsio hardware, resulting in incorrect behaviour (the change that addresses this also flips the state machine to switch on state instead of transition, hope this is okay in current window) * Prio queue info & tsa is no longer thrown away v2: Print DCBx state transition messages only when debug is enabled Signed-off-by: Anish Bhatt <anish@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4: Fix for SR-IOV VF initializationHariprasad Shenai2014-08-061-4/+3
| | | | | | | | | | | | | | | Commit 35b1de5 ("rdma/cxgb4: Fixes cxgb4 probe failure in VM when PF is exposed through PCI Passthrough") introduced a regression, where VF failed to initialize for Physical function 0 to Physical Function 3. In the above commit, we removed the code which used to enable sriov for PF0 to PF3. Now adding it back to get sriov working. V2: Removed SRIOV loop for PF[0..3] to instantiate the VF's as per David Miller's comment Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4 : Disable recursive mailbox commands when enabling viAnish Bhatt2014-08-051-1/+4
| | | | | | | | | | | | Enabling a Virtual Interface can result in an interrupt during the processing of the VI Enable command and, in some paths, result in an attempt to issue another command in the interrupt context, eventually crashing the system. Thus, we disable interrupts during the course of the VI Enable command and ensure enable doesn't sleep. Signed-off-by: Anish Bhatt <anish@chelsio.com> Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4: Export symbols required by cxgb4i for ipv6 support and required definesAnish Bhatt2014-07-171-4/+6
| | | | | Signed-off-by: Anish Bhatt <anish@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4/iw_cxgb4: work request logging featureHariprasad Shenai2014-07-151-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit enhances the iwarp driver to optionally keep a log of rdma work request timining data for kernel mode QPs. If iw_cxgb4 module option c4iw_wr_log is set to non-zero, each work request is tracked and timing data maintained in a rolling log that is 4096 entries deep by default. Module option c4iw_wr_log_size_order allows specifing a log2 size to use instead of the default order of 12 (4096 entries). Both module options are read-only and must be passed in at module load time to set them. IE: modprobe iw_cxgb4 c4iw_wr_log=1 c4iw_wr_log_size_order=10 The timing data is viewable via the iw_cxgb4 debugfs file "wr_log". Writing anything to this file will clear all the timing data. Data tracked includes: - The host time when the work request was posted, just before ringing the doorbell. The host time when the completion was polled by the application. This is also the time the log entry is created. The delta of these two times is the amount of time took processing the work request. - The qid of the EQ used to post the work request. - The work request opcode. - The cqe wr_id field. For sq completions requests this is the swsqe index. For recv completions this is the MSN of the ingress SEND. This value can be used to match log entries from this log with firmware flowc event entries. - The sge timestamp value just before ringing the doorbell when posting, the sge timestamp value just after polling the completion, and CQE.timestamp field from the completion itself. With these three timestamps we can track the latency from post to poll, and the amount of time the completion resided in the CQ before being reaped by the application. With debug firmware, the sge timestamp is also logged by firmware in its flowc history so that we can compute the latency from posting the work request until the firmware sees it. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4/iw_cxgb4: display TPTE on errorsHariprasad Shenai2014-07-151-0/+66
| | | | | | | | | | | With ingress WRITE or READ RESPONSE errors, HW provides the offending stag from the packet. This patch adds logic to log the parsed TPTE in this case. cxgb4 now exports a function to read a TPTE entry from adapter memory. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4/iw_cxgb4: use firmware ord/ird resource limitsHariprasad Shenai2014-07-151-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Advertise a larger max read queue depth for qps, and gather the resource limits from fw and use them to avoid exhaustinq all the resources. Design: cxgb4: Obtain the max_ordird_qp and max_ird_adapter device params from FW at init time and pass them up to the ULDs when they attach. If these parameters are not available, due to older firmware, then hard-code the values based on the known values for older firmware. iw_cxgb4: Fix the c4iw_query_device() to report these correct values based on adapter parameters. ibv_query_device() will always return: max_qp_rd_atom = max_qp_init_rd_atom = min(module_max, max_ordird_qp) max_res_rd_atom = max_ird_adapter Bump up the per qp max module option to 32, allowing it to be increased by the user up to the device max of max_ordird_qp. 32 seems to be sufficient to maximize throughput for streaming read benchmarks. Fail connection setup if the negotiated IRD exhausts the available adapter ird resources. So the driver will track the amount of ird resource in use and not send an RI_WR/INIT to FW that would reduce the available ird resources below zero. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* iw_cxgb4: Detect Ing. Padding Boundary at run-timeHariprasad Shenai2014-07-151-0/+2
| | | | | | | | | Updates iw_cxgb4 to determine the Ingress Padding Boundary from cxgb4_lld_info, and take subsequent actions. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c: remove unnecessary null ↵Fabian Frederick2014-07-021-2/+1
| | | | | | | | | | | | test before debugfs_remove_recursive Fix checkpatch warning: "WARNING: debugfs_remove_recursive(NULL) is safe this check is probably not required" Cc: Hariprasad S <hariprasad@chelsio.com> Cc: netdev@vger.kernel.org Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4: Adds device ID for few more Chelsio T4 AdaptersHariprasad Shenai2014-07-011-0/+20
| | | | | Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4: Replaced the backdoor mechanism to access the HW memory with PCIe ↵Hariprasad Shenai2014-07-011-27/+31
| | | | | | | | | | | | | | | | | | Window method Rip out a bunch of redundant PCI-E Memory Window Read/Write routines, collapse the more general purpose routines into a single routine thereby eliminating the need for a large stack frame (and extra data copying) in the outer routine, change everything to use the improved routine t4_memory_rw. Based on origninal work by Casey Leedom <leedom@chelsio.com> and Steve Wise <swise@opengridcomputing.com> Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4: Use FW interface to get BAR0 valueHariprasad Shenai2014-07-011-7/+65
| | | | | | | | | | | | | | | | | | | | Use the firmware interface to get the BAR0 value since we really don't want to use the PCI-E Configuration Space Backdoor access which is owned by the firmware. Set up PCI-E Memory Window registers using the true values programmed into BAR registers. When the PF4 "Master Function" is exported to a Virtual Machine, the values returned by pci_resource_start() will be for the synthetic PCI-E Configuration Space and not the real addresses. But we need to program the PCI-E Memory Window address decoders with the real addresses that we're going to be using in order to have accesses through the Memory Windows work. Based on origninal work by Casey Leedom <leedom@chelsio.com> Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* rdma/cxgb4: Fixes cxgb4 probe failure in VM when PF is exposed through PCI ↵Hariprasad Shenai2014-07-011-8/+10
| | | | | | | | | | | | | | | | | | Passthrough Change logic which determines our Physical Function at PCI Probe time. Now we read the PL_WHOAMI register and get the Physical Function. Pass Physical Function to Upper Layer Drivers in lld_info structure in the new field "pf" added to lld_info. This is useful for the cases where the PF, say PF4, is attached to a Virtual Machine via some form of "PCI Pass Through" technology and the PCI Function shows up as PF0 in the VM. Based on original work by Casey Leedom <leedom@chelsio.com> Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2014-06-251-9/+7
|\
| * cxgb4: Not need to hold the adap_rcu_lock lock when read adap_rcu_listLi RongQing2014-06-241-9/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cxgb4_netdev maybe lead to dead lock, since it uses a spin lock, and be called in both thread and softirq context, but not disable BH, the lockdep report is below; In fact, cxgb4_netdev only reads adap_rcu_list with RCU protection, so not need to hold spin lock again. ================================= [ INFO: inconsistent lock state ] 3.14.7+ #24 Tainted: G C O --------------------------------- inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. radvd/3794 [HC0[0]:SC1[1]:HE1:SE0] takes: (adap_rcu_lock){+.?...}, at: [<ffffffffa09989ea>] clip_add+0x2c/0x116 [cxgb4] {SOFTIRQ-ON-W} state was registered at: [<ffffffff810fca81>] __lock_acquire+0x34a/0xe48 [<ffffffff810fd98b>] lock_acquire+0x82/0x9d [<ffffffff815d6ff8>] _raw_spin_lock+0x34/0x43 [<ffffffffa09989ea>] clip_add+0x2c/0x116 [cxgb4] [<ffffffffa0998beb>] cxgb4_inet6addr_handler+0x117/0x12c [cxgb4] [<ffffffff815da98b>] notifier_call_chain+0x32/0x5c [<ffffffff815da9f9>] __atomic_notifier_call_chain+0x44/0x6e [<ffffffff815daa32>] atomic_notifier_call_chain+0xf/0x11 [<ffffffff815b1356>] inet6addr_notifier_call_chain+0x16/0x18 [<ffffffffa01f72e5>] ipv6_add_addr+0x404/0x46e [ipv6] [<ffffffffa01f8df0>] addrconf_add_linklocal+0x5f/0x95 [ipv6] [<ffffffffa01fc3e9>] addrconf_notify+0x632/0x841 [ipv6] [<ffffffff815da98b>] notifier_call_chain+0x32/0x5c [<ffffffff810e09a1>] __raw_notifier_call_chain+0x9/0xb [<ffffffff810e09b2>] raw_notifier_call_chain+0xf/0x11 [<ffffffff8151b3b7>] call_netdevice_notifiers_info+0x4e/0x56 [<ffffffff8151b3d0>] call_netdevice_notifiers+0x11/0x13 [<ffffffff8151c0a6>] netdev_state_change+0x1f/0x38 [<ffffffff8152f004>] linkwatch_do_dev+0x3b/0x49 [<ffffffff8152f184>] __linkwatch_run_queue+0x10b/0x144 [<ffffffff8152f1dd>] linkwatch_event+0x20/0x27 [<ffffffff810d7bc0>] process_one_work+0x1cb/0x2ee [<ffffffff810d7e3b>] worker_thread+0x12e/0x1fc [<ffffffff810dd391>] kthread+0xc4/0xcc [<ffffffff815dc48c>] ret_from_fork+0x7c/0xb0 irq event stamp: 3388 hardirqs last enabled at (3388): [<ffffffff810c6c85>] __local_bh_enable_ip+0xaa/0xd9 hardirqs last disabled at (3387): [<ffffffff810c6c2d>] __local_bh_enable_ip+0x52/0xd9 softirqs last enabled at (3288): [<ffffffffa01f1d5b>] rcu_read_unlock_bh+0x0/0x2f [ipv6] softirqs last disabled at (3289): [<ffffffff815ddafc>] do_softirq_own_stack+0x1c/0x30 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(adap_rcu_lock); <Interrupt> lock(adap_rcu_lock); *** DEADLOCK *** 5 locks held by radvd/3794: #0: (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffffa020b85a>] rawv6_sendmsg+0x74b/0xa4d [ipv6] #1: (rcu_read_lock){.+.+..}, at: [<ffffffff8151ac6b>] rcu_lock_acquire+0x0/0x29 #2: (rcu_read_lock){.+.+..}, at: [<ffffffffa01f4cca>] rcu_lock_acquire.constprop.16+0x0/0x30 [ipv6] #3: (rcu_read_lock){.+.+..}, at: [<ffffffff810e09b4>] rcu_lock_acquire+0x0/0x29 #4: (rcu_read_lock){.+.+..}, at: [<ffffffffa0998782>] rcu_lock_acquire.constprop.40+0x0/0x30 [cxgb4] stack backtrace: CPU: 7 PID: 3794 Comm: radvd Tainted: G C O 3.14.7+ #24 Hardware name: Supermicro X7DBU/X7DBU, BIOS 6.00 12/03/2007 ffffffff81f15990 ffff88012fdc36a8 ffffffff815d0016 0000000000000006 ffff8800c80dc2a0 ffff88012fdc3708 ffffffff815cc727 0000000000000001 0000000000000001 ffff880100000000 ffffffff81015b02 ffff8800c80dcb58 Call Trace: <IRQ> [<ffffffff815d0016>] dump_stack+0x4e/0x71 [<ffffffff815cc727>] print_usage_bug+0x1ec/0x1fd [<ffffffff81015b02>] ? save_stack_trace+0x27/0x44 [<ffffffff810fbfaa>] ? check_usage_backwards+0xa0/0xa0 [<ffffffff810fc640>] mark_lock+0x11b/0x212 [<ffffffff810fca0b>] __lock_acquire+0x2d4/0xe48 [<ffffffff810fbfaa>] ? check_usage_backwards+0xa0/0xa0 [<ffffffff810fbff6>] ? check_usage_forwards+0x4c/0xa6 [<ffffffff810c6c8a>] ? __local_bh_enable_ip+0xaf/0xd9 [<ffffffff810fd98b>] lock_acquire+0x82/0x9d [<ffffffffa09989ea>] ? clip_add+0x2c/0x116 [cxgb4] [<ffffffffa0998782>] ? rcu_read_unlock+0x23/0x23 [cxgb4] [<ffffffff815d6ff8>] _raw_spin_lock+0x34/0x43 [<ffffffffa09989ea>] ? clip_add+0x2c/0x116 [cxgb4] [<ffffffffa09987b0>] ? rcu_lock_acquire.constprop.40+0x2e/0x30 [cxgb4] [<ffffffffa0998782>] ? rcu_read_unlock+0x23/0x23 [cxgb4] [<ffffffffa09989ea>] clip_add+0x2c/0x116 [cxgb4] [<ffffffffa0998beb>] cxgb4_inet6addr_handler+0x117/0x12c [cxgb4] [<ffffffff810fd99d>] ? lock_acquire+0x94/0x9d [<ffffffff810e09b4>] ? raw_notifier_call_chain+0x11/0x11 [<ffffffff815da98b>] notifier_call_chain+0x32/0x5c [<ffffffff815da9f9>] __atomic_notifier_call_chain+0x44/0x6e [<ffffffff815daa32>] atomic_notifier_call_chain+0xf/0x11 [<ffffffff815b1356>] inet6addr_notifier_call_chain+0x16/0x18 [<ffffffffa01f72e5>] ipv6_add_addr+0x404/0x46e [ipv6] [<ffffffff810fde6a>] ? trace_hardirqs_on+0xd/0xf [<ffffffffa01fb634>] addrconf_prefix_rcv+0x385/0x6ea [ipv6] [<ffffffffa0207950>] ndisc_rcv+0x9d3/0xd76 [ipv6] [<ffffffffa020d536>] icmpv6_rcv+0x592/0x67b [ipv6] [<ffffffff810c6c85>] ? __local_bh_enable_ip+0xaa/0xd9 [<ffffffff810c6c85>] ? __local_bh_enable_ip+0xaa/0xd9 [<ffffffff810fd8dc>] ? lock_release+0x14e/0x17b [<ffffffffa020df97>] ? rcu_read_unlock+0x21/0x23 [ipv6] [<ffffffff8150df52>] ? rcu_read_unlock+0x23/0x23 [<ffffffffa01f4ede>] ip6_input_finish+0x1e4/0x2fc [ipv6] [<ffffffffa01f540b>] ip6_input+0x33/0x38 [ipv6] [<ffffffffa01f5557>] ip6_mc_input+0x147/0x160 [ipv6] [<ffffffffa01f4ba3>] ip6_rcv_finish+0x7c/0x81 [ipv6] [<ffffffffa01f5397>] ipv6_rcv+0x3a1/0x3e2 [ipv6] [<ffffffff8151ef96>] __netif_receive_skb_core+0x4ab/0x511 [<ffffffff810fdc94>] ? mark_held_locks+0x71/0x99 [<ffffffff8151f0c0>] ? process_backlog+0x69/0x15e [<ffffffff8151f045>] __netif_receive_skb+0x49/0x5b [<ffffffff8151f0cf>] process_backlog+0x78/0x15e [<ffffffff8151f571>] ? net_rx_action+0x1a2/0x1cc [<ffffffff8151f47b>] net_rx_action+0xac/0x1cc [<ffffffff810c69b7>] ? __do_softirq+0xad/0x218 [<ffffffff810c69ff>] __do_softirq+0xf5/0x218 [<ffffffff815ddafc>] do_softirq_own_stack+0x1c/0x30 <EOI> [<ffffffff810c6bb6>] do_softirq+0x38/0x5d [<ffffffffa01f1d5b>] ? ip6_copy_metadata+0x156/0x156 [ipv6] [<ffffffff810c6c78>] __local_bh_enable_ip+0x9d/0xd9 [<ffffffffa01f1d88>] rcu_read_unlock_bh+0x2d/0x2f [ipv6] [<ffffffffa01f28b4>] ip6_finish_output2+0x381/0x3d8 [ipv6] [<ffffffffa01f49ef>] ip6_finish_output+0x6e/0x73 [ipv6] [<ffffffffa01f4a70>] ip6_output+0x7c/0xa8 [ipv6] [<ffffffff815b1bfa>] dst_output+0x18/0x1c [<ffffffff815b1c9e>] ip6_local_out+0x1c/0x21 [<ffffffffa01f2489>] ip6_push_pending_frames+0x37d/0x427 [ipv6] [<ffffffff81558af8>] ? skb_orphan+0x39/0x39 [<ffffffffa020b85a>] ? rawv6_sendmsg+0x74b/0xa4d [ipv6] [<ffffffffa020ba51>] rawv6_sendmsg+0x942/0xa4d [ipv6] [<ffffffff81584cd2>] inet_sendmsg+0x3d/0x66 [<ffffffff81508930>] __sock_sendmsg_nosec+0x25/0x27 [<ffffffff8150b0d7>] sock_sendmsg+0x5a/0x7b [<ffffffff810fd8dc>] ? lock_release+0x14e/0x17b [<ffffffff8116d756>] ? might_fault+0x9e/0xa5 [<ffffffff8116d70d>] ? might_fault+0x55/0xa5 [<ffffffff81508cb1>] ? copy_from_user+0x2a/0x2c [<ffffffff8150b70c>] ___sys_sendmsg+0x226/0x2d9 [<ffffffff810fcd25>] ? __lock_acquire+0x5ee/0xe48 [<ffffffff810fde01>] ? trace_hardirqs_on_caller+0x145/0x1a1 [<ffffffff8118efcb>] ? slab_free_hook.isra.71+0x50/0x59 [<ffffffff8115c81f>] ? release_pages+0xbc/0x181 [<ffffffff810fd99d>] ? lock_acquire+0x94/0x9d [<ffffffff81115e97>] ? read_seqcount_begin.constprop.25+0x73/0x90 [<ffffffff8150c408>] __sys_sendmsg+0x3d/0x5b [<ffffffff8150c433>] SyS_sendmsg+0xd/0x19 [<ffffffff815dc53d>] system_call_fastpath+0x1a/0x1f Reported-by: Ben Greear <greearb@candelatech.com> Cc: Casey Leedom <leedom@chelsio.com> Cc: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | cxgb4 : Update copyright year on all cxgb4 filesAnish Bhatt2014-06-221-1/+1
| | | | | | | | | | Signed-off-by: Anish Bhatt <anish@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | cxgb4 : Integrate DCBx support into cxgb4 module. Register dbcnl_ops to give ↵Anish Bhatt2014-06-221-6/+193
|/ | | | | | | access to DCBx functions Signed-off-by: Anish Bhatt <anish@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4: Change default Interrupt Holdoff Packet Count ThresholdHariprasad Shenai2014-06-101-29/+40
| | | | | | | | Based on original work by Casey Leedom <leedom@chelsio.com> Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* iw_cxgb4: Choose appropriate hw mtu index and ISS for iWARP connectionsHariprasad Shenai2014-06-101-4/+103
| | | | | | | | | | | | | | | | | | | | | | | Select the appropriate hw mtu index and initial sequence number to optimize hw memory performance. Add new cxgb4_best_aligned_mtu() which allows callers to provide enough information to be used to [possibly] select an MTU which will result in the TCP Data Segment Size (AKA Maximum Segment Size) to be an aligned value. If an RTR message exhange is required, then align the ISS to 8B - 1 + 4, so that after the SYN the send seqno will align on a 4B boundary. The RTR message exchange will leave the send seqno aligned on an 8B boundary. If an RTR is not required, then align the ISS to 8B - 1. The goal is to have the send seqno be 8B aligned when we send the first FPDU. Based on original work by Casey Leedom <leeedom@chelsio.com> and Steve Wise <swise@opengridcomputing.com> Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* iw_cxgb4: Allocate and use IQs specifically for indirect interruptsHariprasad Shenai2014-06-101-5/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently indirect interrupts for RDMA CQs funnel through the LLD's RDMA RXQs, which also handle direct interrupts for offload CPLs during RDMA connection setup/teardown. The intended T4 usage model, however, is to have indirect interrupts flow through dedicated IQs. IE not to mix indirect interrupts with CPL messages in an IQ. This patch adds the concept of RDMA concentrator IQs, or CIQs, setup and maintained by the LLD and exported to iw_cxgb4 for use when creating CQs. RDMA CPLs will flow through the LLD's RDMA RXQs, and CQ interrupts flow through the CIQs. Design: cxgb4 creates and exports an array of CIQs for the RDMA ULD. These IQs are sized according to the max available CQs available at adapter init. In addition, these IQs don't need FL buffers since they only service indirect interrupts. One CIQ is setup per RX channel similar to the RDMA RXQs. iw_cxgb4 will utilize these CIQs based on the vector value passed into create_cq(). The num_comp_vectors advertised by iw_cxgb4 will be the number of CIQs configured, and thus the vector value will be the index into the array of CIQs. Based on original work by Steve Wise <swise@opengridcomputing.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethtool: Replace ethtool_ops::{get,set}_rxfh_indir() with {get,set}_rxfh()Ben Hutchings2014-06-031-4/+4
| | | | | | | | | ETHTOOL_{G,S}RXFHINDIR and ETHTOOL_{G,S}RSSH should work for drivers regardless of whether they expose the hash key, unless you try to set a hash key for a driver that doesn't expose it. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* net: get rid of SET_ETHTOOL_OPSWilfried Klaebe2014-05-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | net: get rid of SET_ETHTOOL_OPS Dave Miller mentioned he'd like to see SET_ETHTOOL_OPS gone. This does that. Mostly done via coccinelle script: @@ struct ethtool_ops *ops; struct net_device *dev; @@ - SET_ETHTOOL_OPS(dev, ops); + dev->ethtool_ops = ops; Compile tested only, but I'd seriously wonder if this broke anything. Suggested-by: Dave Miller <davem@davemloft.net> Signed-off-by: Wilfried Klaebe <w-lkml@lebenslange-mailadresse.de> Acked-by: Felipe Balbi <balbi@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* vlan: rename __vlan_find_dev_deep() to __vlan_find_dev_deep_rcu()dingtianhong2014-05-121-1/+1
| | | | | | | | | The __vlan_find_dev_deep should always called in RCU, according David's suggestion, rename to __vlan_find_dev_deep_rcu looks more reasonable. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2014-05-121-0/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/ethernet/altera/altera_sgdma.c net/netlink/af_netlink.c net/sched/cls_api.c net/sched/sch_api.c The netlink conflict dealt with moving to netlink_capable() and netlink_ns_capable() in the 'net' tree vs. supporting 'tc' operations in non-init namespaces. These were simple transformations from netlink_capable to netlink_ns_capable. The Altera driver conflict was simply code removal overlapping some void pointer cast cleanups in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>
| * cxgb4: Decode PCIe Gen3 link speedRoland Dreier2014-04-301-0/+2
| | | | | | | | | | | | | | Add handling for " 8 GT/s" in print_port_info(). Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | cxgb4: Decode the firmware port and module type a bit more for ethtoolHariprasad Shenai2014-05-121-4/+11
|/ | | | | Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4: Adds device ID for few more Chelsio AdaptersHariprasad Shenai2014-03-281-0/+12
| | | | | Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* chelsio: Remove addressof casts to same typeJoe Perches2014-03-261-4/+4
| | | | | | | | | | | | | | | | | | | Using addressof then casting to the original type is pointless, so remove these unnecessary casts. Done via coccinelle script: $ cat typecast.cocci @@ type T; T foo; @@ - (T *)&foo + &foo Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4/iw_cxgb4: Doorbell Drop Avoidance Bug FixesSteve Wise2014-03-141-35/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current logic suffers from a slow response time to disable user DB usage, and also fails to avoid DB FIFO drops under heavy load. This commit fixes these deficiencies and makes the avoidance logic more optimal. This is done by more efficiently notifying the ULDs of potential DB problems, and implements a smoother flow control algorithm in iw_cxgb4, which is the ULD that puts the most load on the DB fifo. Design: cxgb4: Direct ULD callback from the DB FULL/DROP interrupt handler. This allows the ULD to stop doing user DB writes as quickly as possible. While user DB usage is disabled, the LLD will accumulate DB write events for its queues. Then once DB usage is reenabled, a single DB write is done for each queue with its accumulated write count. This reduces the load put on the DB fifo when reenabling. iw_cxgb4: Instead of marking each qp to indicate DB writes are disabled, we create a device-global status page that each user process maps. This allows iw_cxgb4 to only set this single bit to disable all DB writes for all user QPs vs traversing the idr of all the active QPs. If the libcxgb4 doesn't support this, then we fall back to the old approach of marking each QP. Thus we allow the new driver to work with an older libcxgb4. When the LLD upcalls iw_cxgb4 indicating DB FULL, we disable all DB writes via the status page and transition the DB state to STOPPED. As user processes see that DB writes are disabled, they call into iw_cxgb4 to submit their DB write events. Since the DB state is in STOPPED, the QP trying to write gets enqueued on a new DB "flow control" list. As subsequent DB writes are submitted for this flow controlled QP, the amount of writes are accumulated for each QP on the flow control list. So all the user QPs that are actively ringing the DB get put on this list and the number of writes they request are accumulated. When the LLD upcalls iw_cxgb4 indicating DB EMPTY, which is in a workq context, we change the DB state to FLOW_CONTROL, and begin resuming all the QPs that are on the flow control list. This logic runs on until the flow control list is empty or we exit FLOW_CONTROL mode (due to a DB DROP upcall, for example). QPs are removed from this list, and their accumulated DB write counts written to the DB FIFO. Sets of QPs, called chunks in the code, are removed at one time. The chunk size is 64. So 64 QPs are resumed at a time, and before the next chunk is resumed, the logic waits (blocks) for the DB FIFO to drain. This prevents resuming to quickly and overflowing the FIFO. Once the flow control list is empty, the db state transitions back to NORMAL and user QPs are again allowed to write directly to the user DB register. The algorithm is designed such that if the DB write load is high enough, then all the DB writes get submitted by the kernel using this flow controlled approach to avoid DB drops. As the load lightens though, we resume to normal DB writes directly by user applications. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2014-03-051-0/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/wireless/ath/ath9k/recv.c drivers/net/wireless/mwifiex/pcie.c net/ipv6/sit.c The SIT driver conflict consists of a bug fix being done by hand in 'net' (missing u64_stats_init()) whilst in 'net-next' a helper was created (netdev_alloc_pcpu_stats()) which takes care of this. The two wireless conflicts were overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/cxgb4: use remove handler as shutdown handlerThadeu Lima de Souza Cascardo2014-02-251-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Without a shutdown handler, T4 cards behave very badly after a kexec. Some firmware calls return errors indicating allocation failures, for example. This is probably because thouse resources were not released by a BYE message to the firmware, for example. Using the remove handler guarantees we will use a well tested path. With this patch I applied, I managed to use kexec multiple times and probe and iSCSI login worked every time. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | cgxb4: Stop using ethtool SPEED_* constantsBen Hutchings2014-02-241-10/+10
| | | | | | | | | | | | | | | | | | ethtool speed values are just numbers of megabits and there is no need to add SPEED_40000. To be consistent, use integer constants directly for all speeds. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
* | cxgb4: Add more PCI device ids.Hariprasad Shenai2014-02-181-0/+4
| | | | | | | | | | Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | cxgb4: Remove unused registers and add missing onesKumar Sanghvi2014-02-181-2/+2
| | | | | | | | | | | | | | | | Remove unused registers for registers list, and add missing ones Based on original work by Santosh Rastapur <santosh@chelsio.com> Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | cxgb4: Query firmware for T5 ULPTX MEMWRITE DSGL capabilitiesKumar Sanghvi2014-02-181-0/+16
| | | | | | | | | | | | | | | | | | | | | | Query firmware to see whether we're allowed to use T5 ULPTX MEMWRITE DSGL capabilities. Also pass that information to Upper Layer Drivers via the new (struct cxgb4_lld_info).ulptx_memwrite_dsgl boolean. Based on original work by Casey Leedom <leedom@chelsio.com> Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | cxgb4: Allow >10G ports to have multiple queuesKumar Sanghvi2014-02-181-4/+5
| | | | | | | | | | | | | | Based on original work by Divy Le Ray. Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | cxgb4: Print adapter VPD Part Number instead of Engineering Change fieldKumar Sanghvi2014-02-181-2/+2
| | | | | | | | | | | | | | | | When we attach to adapter, print VPD Part Number instead of Engineering Change field. Based on original work by Casey Leedom <leedom@chelsio.com> Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | cxgb4: Add support to recognize 40G linksKumar Sanghvi2014-02-181-8/+16
| | | | | | | | | | | | | | | | | | | | | | | | Also, create a new Common Code interface to translate Firmware Port Technology Type values (enum fw_port_type) to string descriptions. This will allow us to maintain the description translation table in one place rather than in every driver. Based on original work by Scott Bardone and Casey Leedom <leedom@chelsio.com> Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | cxgb4: Use pci_enable_msix_range() instead of pci_enable_msix()Alexander Gordeev2014-02-181-26/+24
|/ | | | | | | | | | | | | | As result of deprecation of MSI-X/MSI enablement functions pci_enable_msix() and pci_enable_msi_block() all drivers using these two interfaces need to be updated to use the new pci_enable_msi_range() and pci_enable_msix_range() interfaces. Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Cc: Dimitris Michailidis <dm@chelsio.com> Cc: netdev@vger.kernel.org Cc: linux-pci@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
* net/cxgb4: Fix referencing freed adapterGavin Shan2014-01-241-1/+1
| | | | | | | | | | | | The adapter is freed before we check its flags. It was caused by commit 144be3d ("net/cxgb4: Avoid disabling PCI device for towice"). The problem was reported by Intel's "0-day" tool. The patch fixes it to avoid reverting commit 144be3d. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/cxgb4: Don't retrieve stats during recoveryGavin Shan2014-01-231-0/+10
| | | | | | | | | | | | | We possibly retrieve the adapter's statistics during EEH recovery and that should be disallowed. Otherwise, it would possibly incur replicate EEH error and EEH recovery is going to fail eventually. The patch reuses statistics lock and checks net_device is attached before going to retrieve statistics, so that the problem can be avoided. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/cxgb4: Avoid disabling PCI device for towiceGavin Shan2014-01-231-5/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we have EEH error happens to the adapter and we have to remove it from the system for some reasons (e.g. more than 5 EEH errors detected from the device in last hour), the adapter will be disabled for towice separately by eeh_err_detected() and remove_one(), which will incur following unexpected backtrace. The patch tries to avoid it. WARNING: at drivers/pci/pci.c:1431 CPU: 12 PID: 121 Comm: eehd Not tainted 3.13.0-rc7+ #1 task: c0000001823a3780 ti: c00000018240c000 task.ti: c00000018240c000 NIP: c0000000003c1e40 LR: c0000000003c1e3c CTR: 0000000001764c5c REGS: c00000018240f470 TRAP: 0700 Not tainted (3.13.0-rc7+) MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI> CR: 28000024 XER: 00000004 CFAR: c000000000706528 SOFTE: 1 GPR00: c0000000003c1e3c c00000018240f6f0 c0000000010fe1f8 0000000000000035 GPR04: 0000000000000000 0000000000000000 00000000003ae509 0000000000000000 GPR08: 000000000000346f 0000000000000000 0000000000000000 0000000000003fef GPR12: 0000000028000022 c00000000ec93000 c0000000000c11b0 c000000184ac3e40 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR24: 0000000000000000 c0000000009398d8 c00000000101f9c0 c0000001860ae000 GPR28: c000000182ba0000 00000000000001f0 c0000001860ae6f8 c0000001860ae000 NIP [c0000000003c1e40] .pci_disable_device+0xd0/0xf0 LR [c0000000003c1e3c] .pci_disable_device+0xcc/0xf0 Call Trace: [c0000000003c1e3c] .pci_disable_device+0xcc/0xf0 (unreliable) [d0000000073881c4] .remove_one+0x174/0x320 [cxgb4] [c0000000003c57e0] .pci_device_remove+0x60/0x100 [c00000000046396c] .__device_release_driver+0x9c/0x120 [c000000000463a20] .device_release_driver+0x30/0x60 [c0000000003bcdb4] .pci_stop_bus_device+0x94/0xd0 [c0000000003bcf48] .pci_stop_and_remove_bus_device+0x18/0x30 [c00000000003f548] .pcibios_remove_pci_devices+0xa8/0x140 [c000000000035c00] .eeh_handle_normal_event+0xa0/0x3c0 [c000000000035f50] .eeh_handle_event+0x30/0x2b0 [c0000000000362c4] .eeh_event_handler+0xf4/0x1b0 [c0000000000c12b8] .kthread+0x108/0x130 [c00000000000a168] .ret_from_kernel_thread+0x5c/0x74 Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4: Add API to correctly calculate tuple fieldsKumar Sanghvi2013-12-221-15/+5
| | | | | | | | | | | | | | | | | Adds API cxgb4_select_ntuple so as to enable Upper Level Drivers to correctly calculate the tuple fields. Adds constant definitions for TP_VLAN_PRI_MAP for the Compressed Filter Tuple field widths and structures and uses them. Also, the CPL Parameters field for T5 is 40 bits so we need to prototype cxgb4_select_ntuple() to calculate and return u64 values. Based on original work by Casey Leedom <leedom@chelsio.com> Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4: Account for stid entries properly in case of IPv6Kumar Sanghvi2013-12-221-2/+12
| | | | | | | | | | | | | | | | | IPv6 uses 2 TIDs with CLIP enabled and 4 TIDs without CLIP. Currently we are incrementing STIDs in use by 1 for both IPv4 and IPv6 which is wrong. Further, driver currently does not have interface to query if CLIP is programmed for particular IPv6 address. So, in this patch we increment/decrement TIDs in use by 4 for IPv6 assuming absence of CLIP. Such assumption keeps us on safe side and we don't end up allocating more stids for IPv6 than actually supported. Based on original work by Santosh Rastapur <santosh@chelsio.com> Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4: Assign filter server TIDs properlyKumar Sanghvi2013-12-221-4/+12
| | | | | | | | | | | | | The LE workaround code is incorrectly reusing the TCAM TIDs (meant for allocation by firmware in case of hash collisions) for filter servers. This patch assigns the filter server TIDs properly starting from sftid_base index. Based on original work by Santosh Rastapur <santosh@chelsio.com> Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4: Include TCP as protocol when creating server filtersKumar Sanghvi2013-12-221-0/+5
| | | | | | | | | | | | | | We were creating LE Workaround Server Filters without specifying IPPROTO_TCP (6) in the filters (when F_PROTOCOL is set in TP_VLAN_PRI_MAP). This meant that UDP packets with matching IP Addresses/Ports would get caught up in the filter and be delivered to ULDs like iw_cxgb4. So, include the protocol information in the server filter properly. Based on original work by Casey Leedom <leedom@chelsio.com> Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4: Reserve stid 0 for T4/T5 adaptersKumar Sanghvi2013-12-221-0/+6
| | | | | | | | | | | | | | | | When creating offload server entries, an IPv6 passive connection request can trigger a reply with a null STID, whereas the driver would expect the reply 'STID to match the value used for the request. This happens due to h/w limitation on T4 and T5. This patch ensures that STID 0 is never used if the stid range starts from zero. Based on original work by Santosh Rastapur <santosh@chelsio.com> Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>