summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* IB/rdmavt: Add additional fields to post send traceMike Marciniszyn2017-04-052-4/+32
| | | | | | | | | | | | | | | | | | | | | | | | This fix is to get additional debugging information. The following fields are added: - wqe - qpt - num_sge - ssn - pid - send_flags These additional fields provide for more focused filtering and triggering. The patch also moves the trace to just before the wqe is posted to get the most accurate information and future proofs the code to trace all possible reserved opcodes. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/rdmavt, IB/hfi1, IB/qib: Make wc opcode translation driver dependentMike Marciniszyn2017-04-058-24/+55
| | | | | | | | | | | | | | | | The work to create a completion helper moved the translation of send wqe operations to completion opcodes to rdmvat. This precludes having driver dependent operations. Make the translation driver dependent by doing the translation in the driver prior to the rvt_qp_swqe_complete() call using restored translation tables. Fixes: Commit f2dc9cdce83c ("IB/rdmavt: Add a send completion helper") Fixes: Commit 0771da5a6e9d ("IB/hfi1,IB/qib: Use new send completion helper") Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/hfi1: NULL pointer dereference when freeing rhashtableSebastian Sanchez2017-04-052-12/+28
| | | | | | | | | | | | | | | | | A NULL pointer dereference occurs when the driver is unloaded, and the SDMA rhashtable is freed if the rhashtable_init() function has not been called. Prevent this by changing sdma_rht to be a pointer to a dynamically allocated hash table. The NULL-ness of the pointer serves as an indication that the hash table was initialized and that it needs to be destroyed. Fixes: 0cb2aa690c7e ("IB/hfi1: Add sysfs interface for affinity setup") Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/hfi1: Cache registers during state changeMichael J. Ruhl2017-04-051-3/+55
| | | | | | | | | | | | When the LCB is going offline, inopportune port queries can cause benign error messages to be logged. To deal with this, cache the registers just before setting the LCB to offline, allowing queries to return without eliciting the error. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/hfi1: Race hazard avoidance in user SDMA driverMichael J. Ruhl2017-04-051-1/+2
| | | | | | | | | | Set the errcode before the state and add the smb_wmb() to avoid a potential race condition with the user. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Michael Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/hfi1: Force logical link downDean Luick2017-04-051-18/+68
| | | | | | | | | | | | If the logical link state does not read as down when the physical link state is offline, force it to down. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Jakub Byczkowski <jakub.byczkowski@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/IPoIB: ibX: failed to create mcg debug fileShamir Rabinovitch2017-04-053-8/+42
| | | | | | | | | | | | | | | | When udev renames the netdev devices, ipoib debugfs entries does not get renamed. As a result, if subsequent probe of ipoib device reuse the name then creating a debugfs entry for the new device would fail. Also, moved ipoib_create_debug_files and ipoib_delete_debug_files as part of ipoib event handling in order to avoid any race condition between these. Fixes: 1732b0ef3b3a ([IPoIB] add path record information in debugfs) Cc: stable@vger.kernel.org # 2.6.15+ Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com> Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/hns: Explicitly include linux/of.hMark Brown2017-04-051-0/+1
| | | | | | | | | hns_roce_hw_v1.c uses DT interfaces but relies on implict inclusion of linux/of.h which means that changes in other headers could break the build, as happened in -next for arm64 today. Add an explicit include. Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/core: Change completion channel to use the reworked objects schemaMatan Barak2017-04-056-148/+257
| | | | | | | | | | | | | | | | | | | | | | | This patch adds the standard fd based type - completion_channel. The completion_channel is now prefixed with ib_uobject, similarly to the rest of the uobjects. This requires a few changes: (1) We define a new completion channel fd based object type. (2) completion_event and async_event are now two different types. This means they use different fops. (3) We release the completion_channel exactly as we release other idr based objects. (4) Since ib_uobjects are already kref-ed, we only add the kref to the async event. A fd object requires filling out several parameters. Its op pointer should point to uverbs_fd_ops and its size should be at least the size if ib_uobject. We use a macro to make the type declaration easier. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/core: Add support for fd objectsMatan Barak2017-04-056-2/+210
| | | | | | | | | | | | | | | | | The completion channel we use in verbs infrastructure is FD based. Previously, we had a separate way to manage this object. Since we strive for a single way to manage any kind of object in this infrastructure, we conceptually treat all objects as subclasses of ib_uobject. This commit adds the necessary mechanism to support FD based objects like their IDR counterparts. FD objects release need to be synchronized with context release. We use the cleanup_mutex on the uverbs_file for that. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/core: Add lock to multicast handlersMatan Barak2017-04-052-0/+7
| | | | | | | | | | | | | | | | | | When two handlers used the same object in the old schema, we blocked the process in the kernel. The new schema just returns -EBUSY. This could lead to different behaviour in applications between the old schema and the new schema. In most cases, using such handlers concurrently could lead to crashing the process. For example, if thread A destroys a QP and thread B modifies it, we could have the destruction happens before the modification. In this case, we are accessing freed memory which could lead to crashing the process. This is true for most cases. However, attaching and detaching a multicast address from QP concurrently is safe. Therefore, we preserve the original behaviour by adding a lock there. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/core: Change idr objects to use the new schemaMatan Barak2017-04-056-1144/+402
| | | | | | | | | | | | | | | | | | | | | | | This changes only the handlers which deals with idr based objects to use the new idr allocation, fetching and destruction schema. This patch consists of the following changes: (1) Allocation, fetching and destruction is done via idr ops. (2) Context initializing and release is done through uverbs_initialize_ucontext and uverbs_cleanup_ucontext. (3) Ditching the live flag. Mostly, this is pretty straight forward. The only place that is a bit trickier is in ib_uverbs_open_qp. Commit [1] added code to check whether the uobject is already live and initialized. This mostly happens because of a race between open_qp and events. We delayed assigning the uobject's pointer in order to eliminate this race without using the live variable. [1] commit a040f95dc819 ("IB/core: Fix XRC race condition in ib_uverbs_open_qp") Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/core: Add idr based standard typesMatan Barak2017-04-057-10/+329
| | | | | | | | | | | | | | | This patch adds the standard idr based types. These types are used in downstream patches in order to initialize, destroy and lookup IB standard objects which are based on idr objects. An idr object requires filling out several parameters. Its op pointer should point to uverbs_idr_ops and its size should be at least the size of ib_uobject. We add a macro to make the type declaration easier. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/core: Add support for idr typesMatan Barak2017-04-055-1/+660
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The new ioctl infrastructure supports driver specific objects. Each such object type has a hot unplug function, allocation size and an order of destruction. When a ucontext is created, a new list is created in this ib_ucontext. This list contains all objects created under this ib_ucontext. When a ib_ucontext is destroyed, we traverse this list several time destroying the various objects by the order mentioned in the object type description. If few object types have the same destruction order, they are destroyed in an order opposite to their creation. Adding an object is done in two parts. First, an object is allocated and added to idr tree. Then, the command's handlers (in downstream patches) could work on this object and fill in its required details. After a successful command, the commit part is called and the user objects become ucontext visible. If the handler failed, alloc_abort should be called. Removing an uboject is done by calling lookup_get with the write flag and finalizing it with destroy_commit. A major change from the previous code is that we actually destroy the kernel object itself in destroy_commit (rather than just the uobject). We should make sure idr (per-uverbs-file) and list (per-ucontext) could be accessed concurrently without corrupting them. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/core: Refactor idr to be per uverbs_fileMatan Barak2017-04-054-127/+95
| | | | | | | | | | | | | | | | The current code creates an idr per type. Since types are currently common for all drivers and known in advance, this was good enough. However, the proposed ioctl based infrastructure allows each driver to declare only some of the common types and declare its own specific types. Thus, we decided to implement idr to be per uverbs_file. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* Linux 4.11-rc3v4.11-rc3Linus Torvalds2017-03-191-1/+1
|
* mm/swap: don't BUG_ON() due to uninitialized swap slot cacheLinus Torvalds2017-03-191-1/+1
| | | | | | | | | | | | | | | | | | This BUG_ON() triggered for me once at shutdown, and I don't see a reason for the check. The code correctly checks whether the swap slot cache is usable or not, so an uninitialized swap slot cache is not actually problematic afaik. I've temporarily just switched the BUG_ON() to a WARN_ON_ONCE(), since I'm not sure why that seemingly pointless check was there. I suspect the real fix is to just remove it entirely, but for now we'll warn about it but not bring the machine down. Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge tag 'powerpc-4.11-5' of ↵Linus Torvalds2017-03-194-2/+6
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull more powerpc fixes from Michael Ellerman: "A couple of minor powerpc fixes for 4.11: - wire up statx() syscall - don't print a warning on memory hotplug when HPT resizing isn't available Thanks to: David Gibson, Chandan Rajendra" * tag 'powerpc-4.11-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/pseries: Don't give a warning when HPT resizing isn't available powerpc: Wire up statx() syscall
| * powerpc/pseries: Don't give a warning when HPT resizing isn't availableMichael Ellerman2017-03-171-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | As of commit 438cc81a41e8 ("powerpc/pseries: Automatically resize HPT for memory hot add/remove"), when running on the pseries platform, we always attempt to use the PAPR extension to resize the hashed page table (HPT) when we add or remove memory. This is fine, but when the extension is not available we'll give a harmless, but scary warning. Instead check if the firmware supports HPT resizing before populating the mmu_hash_ops.resize_hpt pointer. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc: Wire up statx() syscallChandan Rajendra2017-03-163-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Test runs on a ppc64 BE guest succeeded. linux/samples/statx/test-statx program was executed on the following file types, 1. Regular file 2. Directory 3. device file 4. symlink 5. Named pipe The test run also included invoking test-statx with the runtime options provided in the main() function of test-statx.c Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | Merge branch 'parisc-4.11-2' of ↵Linus Torvalds2017-03-198-68/+88
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc fixes from Helge Deller: - Mikulas Patocka added support for R_PARISC_SECREL32 relocations in modules with CONFIG_MODVERSIONS. - Dave Anglin optimized the cache flushing for vmap ranges. - Arvind Yadav provided a fix for a potential NULL pointer dereference in the parisc perf code (and some code cleanups). - I wired up the new statx system call, fixed some compiler warnings with the access_ok() macro and fixed shutdown code to really halt a system at shutdown instead of crashing & rebooting. * 'parisc-4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Fix system shutdown halt parisc: perf: Fix potential NULL pointer dereference parisc: Avoid compiler warnings with access_ok() parisc: Wire up statx system call parisc: Optimize flush_kernel_vmap_range and invalidate_kernel_vmap_range parisc: support R_PARISC_SECREL32 relocation in modules
| * | parisc: Fix system shutdown haltHelge Deller2017-03-181-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On those parisc machines which don't provide a software power off function, the system currently kills the init process at the end of a shutdown and unexpectedly restarts insteads of halting. Fix it by adding a loop which will not return. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # 4.9+
| * | parisc: perf: Fix potential NULL pointer dereferenceArvind Yadav2017-03-181-45/+49
| | | | | | | | | | | | | | | | | | | | | | | | Fix potential NULL pointer dereference and clean up coding style errors (code indent, trailing whitespaces). Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Helge Deller <deller@gmx.de>
| * | parisc: Avoid compiler warnings with access_ok()Helge Deller2017-03-151-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 09b871ffd4d8 (parisc: Define access_ok() as macro) missed to mark uaddr as used, which then gives compiler warnings about unused variables. Fix it by comparing uaddr to uaddr which then gets optimized away by the compiler. Signed-off-by: Helge Deller <deller@gmx.de> Fixes: 09b871ffd4d8 ("parisc: Define access_ok() as macro")
| * | parisc: Wire up statx system callHelge Deller2017-03-152-1/+3
| | | | | | | | | | | | Signed-off-by: Helge Deller <deller@gmx.de>
| * | parisc: Optimize flush_kernel_vmap_range and invalidate_kernel_vmap_rangeJohn David Anglin2017-03-152-21/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The previously submitted patch did not resolve the random segmentation faults observed on the phantom buildd system. There are still unresolved problems with the Debian 4.8 and 4.9 kernels on C8000. The attached patch removes the flush of the offset map pages and does a whole data cache flush for large ranges. No other arch flushes the offset map in these routines as far as I can tell. I have not observed any random segmentation faults on rp3440 in two weeks of testing with 4.10.0 and 4.10.1. Signed-off-by: John David Anglin <dave.anglin@bell.net> Cc: stable@vger.kernel.org # v4.8+ Signed-off-by: Helge Deller <deller@gmx.de>
| * | parisc: support R_PARISC_SECREL32 relocation in modulesMikulas Patocka2017-03-151-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The parisc kernel doesn't work with CONFIG_MODVERSIONS since the commit 71810db27c1c853b335675bee335d893bc3d324b. It can't load modules with the error: "module unix: Unknown relocation: 41". The commit changes __kcrctab from 64-bit valus to 32-bit values. The assembler generates R_PARISC_SECREL32 secrel relocation for them and the module loader doesn't support this relocation. This patch adds the R_PARISC_SECREL32 relocation to the module loader. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org # v4.10+ Signed-off-by: Helge Deller <deller@gmx.de>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pendingLinus Torvalds2017-03-1925-548/+1274
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull SCSI target fixes from Nicholas Bellinger: "The bulk of the changes are in qla2xxx target driver code to address various issues found during Cavium/QLogic's internal testing (stable CC's included), along with a few other stability and smaller miscellaneous improvements. There are also a couple of different patch sets from Mike Christie, which have been a result of his work to use target-core ALUA logic together with tcm-user backend driver. Finally, a patch to address some long standing issues with pass-through SCSI export of TYPE_TAPE + TYPE_MEDIUM_CHANGER devices, which will make folks using physical (or virtual) magnetic tape happy" * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (28 commits) qla2xxx: Update driver version to 9.00.00.00-k qla2xxx: Fix delayed response to command for loop mode/direct connect. qla2xxx: Change scsi host lookup method. qla2xxx: Add DebugFS node to display Port Database qla2xxx: Use IOCB interface to submit non-critical MBX. qla2xxx: Add async new target notification qla2xxx: Export DIF stats via debugfs qla2xxx: Improve T10-DIF/PI handling in driver. qla2xxx: Allow relogin to proceed if remote login did not finish qla2xxx: Fix sess_lock & hardware_lock lock order problem. qla2xxx: Fix inadequate lock protection for ABTS. qla2xxx: Fix request queue corruption. qla2xxx: Fix memory leak for abts processing qla2xxx: Allow vref count to timeout on vport delete. tcmu: Convert cmd_time_out into backend device attribute tcmu: make cmd timeout configurable tcmu: add helper to check if dev was configured target: fix race during implicit transition work flushes target: allow userspace to set state to transitioning target: fix ALUA transition timeout handling ...
| * | | qla2xxx: Update driver version to 9.00.00.00-kHimanshu Madhani2017-03-181-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> signed-off-by: Giridhar Malavali <giridhar.malavali@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | | qla2xxx: Fix delayed response to command for loop mode/direct connect.Quinn Tran2017-03-186-20/+81
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current driver wait for FW to be in the ready state before processing in-coming commands. For Arbitrated Loop or Point-to- Point (not switch), FW Ready state can take a while. FW will transition to ready state after all Nports have been logged in. In the mean time, certain initiators have completed the login and starts IO. Driver needs to start processing all queues if FW is already started. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | | qla2xxx: Change scsi host lookup method.Quinn Tran2017-03-187-40/+100
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For target mode, when new scsi command arrive, driver first performs a look up of the SCSI Host. The current look up method is based on the ALPA portion of the NPort ID. For Cisco switch, the ALPA can not be used as the index. Instead, the new search method is based on the full value of the Nport_ID via btree lib. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | | qla2xxx: Add DebugFS node to display Port DatabaseHimanshu Madhani2017-03-182-4/+90
| | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Giridhar Malavali <giridhar.malavali@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | | qla2xxx: Use IOCB interface to submit non-critical MBX.Quinn Tran2017-03-186-65/+279
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Mailbox interface is currently over subscribed. We like to reserve the Mailbox interface for the chip managment and link initialization. Any non essential Mailbox command will be routed through the IOCB interface. The IOCB interface is able to absorb more commands. Following commands are being routed through IOCB interface - Get ID List (007Ch) - Get Port DB (0064h) - Get Link Priv Stats (006Dh) Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | | qla2xxx: Add async new target notificationQuinn Tran2017-03-182-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | | qla2xxx: Export DIF stats via debugfsAnil Gurumurthy2017-03-182-0/+27
| | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Anil Gurumurthy <anil.gurumurthy@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | | qla2xxx: Improve T10-DIF/PI handling in driver.Quinn Tran2017-03-187-251/+406
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add routines to support T10 DIF tag. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Anil Gurumurthy <anil.gurumurthy@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | | qla2xxx: Allow relogin to proceed if remote login did not finishQuinn Tran2017-03-184-8/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the remote port have started the login process, then the PLOGI and PRLI should be back to back. Driver will allow the remote port to complete the process. For the case where the remote port decide to back off from sending PRLI, this local port sets an expiration timer for the PRLI. Once the expiration time passes, the relogin retry logic is allowed to go through and perform login with the remote port. Signed-off-by: Quinn Tran <quinn.tran@qlogic.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | | qla2xxx: Fix sess_lock & hardware_lock lock order problem.Quinn Tran2017-03-181-23/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The main lock that needs to be held for CMD or TMR submission to upper layer is the sess_lock. The sess_lock is used to serialize cmd submission and session deletion. The addition of hardware_lock being held is not necessary. This patch removes hardware_lock dependency from CMD/TMR submission. Use hardware_lock only for error response in this case. Path1 CPU0 CPU1 ---- ---- lock(&(&ha->tgt.sess_lock)->rlock); lock(&(&ha->hardware_lock)->rlock); lock(&(&ha->tgt.sess_lock)->rlock); lock(&(&ha->hardware_lock)->rlock); Path2/deadlock *** DEADLOCK *** Call Trace: dump_stack+0x85/0xc2 print_circular_bug+0x1e3/0x250 __lock_acquire+0x1425/0x1620 lock_acquire+0xbf/0x210 _raw_spin_lock_irqsave+0x53/0x70 qlt_sess_work_fn+0x21d/0x480 [qla2xxx] process_one_work+0x1f4/0x6e0 Cc: <stable@vger.kernel.org> Cc: Bart Van Assche <Bart.VanAssche@sandisk.com> Reported-by: Bart Van Assche <Bart.VanAssche@sandisk.com> Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | | qla2xxx: Fix inadequate lock protection for ABTS.Quinn Tran2017-03-181-2/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Normally, ABTS is sent to Target Core as Task MGMT command. In the case of error, qla2xxx needs to send response, hardware_lock is required to prevent request queue corruption. Cc: <stable@vger.kernel.org> Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | | qla2xxx: Fix request queue corruption.Quinn Tran2017-03-181-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When FW notify driver or driver detects low FW resource, driver tries to send out Busy SCSI Status to tell Initiator side to back off. During the send process, the lock was not held. Cc: <stable@vger.kernel.org> Signed-off-by: Quinn Tran <quinn.tran@qlogic.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | | qla2xxx: Fix memory leak for abts processingQuinn Tran2017-03-181-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Cc: <stable@vger.kernel.org> Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | | qla2xxx: Allow vref count to timeout on vport delete.Joe Carnuccio2017-03-185-10/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Cc: <stable@vger.kernel.org> Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | | tcmu: Convert cmd_time_out into backend device attributeNicholas Bellinger2017-03-181-26/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of putting cmd_time_out under ../target/core/user_0/foo/control, which has historically been used by parameters needed for initial backend device configuration, go ahead and move cmd_time_out into a backend device attribute. In order to do this, tcmu_module_init() has been updated to create a local struct configfs_attribute **tcmu_attrs, that is based upon the existing passthrough_attrib_attrs along with the new cmd_time_out attribute. Once **tcm_attrs has been setup, go ahead and point it at tcmu_ops->tb_dev_attrib_attrs so it's picked up by target-core. Also following MNC's previous change, ->cmd_time_out is stored in milliseconds but exposed via configfs in seconds. Also, note this patch restricts the modification of ->cmd_time_out to before + after the TCMU device has been configured, but not while it has active fabric exports. Cc: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | | tcmu: make cmd timeout configurableMike Christie2017-03-181-6/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A single daemon could implement multiple types of devices using multuple types of real devices that may not support restarting from crashes and/or handling tcmu timeouts. This makes the cmd timeout configurable, so handlers that do not support it can turn if off for now. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | | tcmu: add helper to check if dev was configuredMike Christie2017-03-181-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds a helper to check if the dev was configured. It will be used in the next patch to prevent updates to some config settings after the device has been setup. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | | target: fix race during implicit transition work flushesMike Christie2017-03-181-9/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes the following races: 1. core_alua_do_transition_tg_pt could have read tg_pt_gp_alua_access_state and gone into this if chunk: if (!explicit && atomic_read(&tg_pt_gp->tg_pt_gp_alua_access_state) == ALUA_ACCESS_STATE_TRANSITION) { and then core_alua_do_transition_tg_pt_work could update the state. core_alua_do_transition_tg_pt would then only set tg_pt_gp_alua_pending_state and the tg_pt_gp_alua_access_state would not get updated with the second calls state. 2. core_alua_do_transition_tg_pt could be setting tg_pt_gp_transition_complete while the tg_pt_gp_transition_work is already completing. core_alua_do_transition_tg_pt then waits on the completion that will never be called. To handle these issues, we just call flush_work which will return when core_alua_do_transition_tg_pt_work has completed so there is no need to do the complete/wait. And, if core_alua_do_transition_tg_pt_work was running, instead of trying to sneak in the state change, we just schedule up another core_alua_do_transition_tg_pt_work call. Note that this does not handle a possible race where there are multiple threads call core_alua_do_transition_tg_pt at the same time. I think we need a mutex in target_tg_pt_gp_alua_access_state_store. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | | target: allow userspace to set state to transitioningMike Christie2017-03-181-15/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Userspace target_core_user handlers like tcmu-runner may want to set the ALUA state to transitioning while it does implicit transitions. This patch allows that state when set from configfs. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | | target: fix ALUA transition timeout handlingMike Christie2017-03-182-16/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The implicit transition time tells initiators the min time to wait before timing out a transition. We currently schedule the transition to occur in tg_pt_gp_implicit_trans_secs seconds so there is no room for delays. If core_alua_do_transition_tg_pt_work->core_alua_update_tpg_primary_metadata needs to write out info to a remote file, then the initiator can easily time out the operation. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | | target: Use system workqueue for ALUA transitionsMike Christie2017-03-181-5/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If tcmu-runner is processing a STPG and needs to change the kernel's ALUA state then we cannot use the same work queue for task management requests and ALUA transitions, because we could deadlock. The problem occurs when a STPG times out before tcmu-runner is able to call into target_tg_pt_gp_alua_access_state_store-> core_alua_do_port_transition -> core_alua_do_transition_tg_pt -> queue_work. In this case, the tmr is on the work queue waiting for the STPG to complete, but the STPG transition is now queued behind the waiting tmr. Note: This bug will also be fixed by this patch: http://www.spinics.net/lists/target-devel/msg14560.html which switches the tmr code to use the system workqueues. For both, I am not sure if we need a dedicated workqueue since it is not a performance path and I do not think we need WQ_MEM_RECLAIM to make forward progress to free up memory like the block layer does. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | | target: fail ALUA transitions for pscsiMike Christie2017-03-181-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We do not setup the LU group for pscsi devices, so if you write a state to alua_access_state that will cause a transition you will get a NULL pointer dereference. This patch will fail attempts to try and transition the path for backend devices that set the TRANSPORT_FLAG_PASSTHROUGH_ALUA flag. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>