summaryrefslogtreecommitdiffstats
path: root/drivers/target
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds2018-12-2821-570/+500
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull SCSI updates from James Bottomley: "This is mostly update of the usual drivers: smarpqi, lpfc, qedi, megaraid_sas, libsas, zfcp, mpt3sas, hisi_sas. Additionally, we have a pile of annotation, unused variable and minor updates. The big API change is the updates for Christoph's DMA rework which include removing the DISABLE_CLUSTERING flag. And finally there are a couple of target tree updates" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (259 commits) scsi: isci: request: mark expected switch fall-through scsi: isci: remote_node_context: mark expected switch fall-throughs scsi: isci: remote_device: Mark expected switch fall-throughs scsi: isci: phy: Mark expected switch fall-through scsi: iscsi: Capture iscsi debug messages using tracepoints scsi: myrb: Mark expected switch fall-throughs scsi: megaraid: fix out-of-bound array accesses scsi: mpt3sas: mpt3sas_scsih: Mark expected switch fall-through scsi: fcoe: remove set but not used variable 'port' scsi: smartpqi: call pqi_free_interrupts() in pqi_shutdown() scsi: smartpqi: fix build warnings scsi: smartpqi: update driver version scsi: smartpqi: add ofa support scsi: smartpqi: increase fw status register read timeout scsi: smartpqi: bump driver version scsi: smartpqi: add smp_utils support scsi: smartpqi: correct lun reset issues scsi: smartpqi: correct volume status scsi: smartpqi: do not offline disks for transient did no connect conditions scsi: smartpqi: allow for larger raid maps ...
| * scsi: remove the use_clustering flagChristoph Hellwig2018-12-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | The same effects can be achieved by setting the dma_boundary to PAGE_SIZE - 1 and the max_segment_size to PAGE_SIZE, so shift those settings into the drivers. Note that in many cases the setting might be bogus, but this keeps the status quo. [mkp: fix myrs and myrb] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: target/core: Use kmem_cache_free() instead of kfree()Wei Yongjun2018-12-181-1/+1
| | | | | | | | | | | | | | | | | | | | memory allocated by kmem_cache_alloc() should be freed using kmem_cache_free(), not kfree(). Fixes: ad669505c4e9 ("scsi: target/core: Make sure that target_wait_for_sess_cmds() waits long enough") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: target: perform t10_wwn ID initialisation in target_alloc_device()David Disseldorp2018-12-071-14/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Initialise the t10_wwn vendor, model and revision defaults when a device is allocated instead of when it's enabled. This ensures that custom vendor or model strings set prior to enablement are not later overwritten with default values. The TRANSPORT_FLAG_PASSTHROUGH conditional can be dropped for the following reasons: - target_core_pscsi overwrites the defaults in the pscsi_configure_device() callback. + the contents is then only used for ConfigFS via $pscsi_dev/statistics/scsi_lu/vend, etc. - target_core_user doesn't touch the defaults, nor are they used for anything outside of ConfigFS. Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: target: remove hardcoded T10 Vendor ID in INQUIRY responseDavid Disseldorp2018-12-071-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | Use the value stored in t10_wwn.vendor, which defaults to "LIO-ORG", but can be reconfigured via the vendor_id ConfigFS attribute. Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Bryant G. Ly <bly@catalogicsoftware.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: target: add device vendor_id configfs attributeDavid Disseldorp2018-12-071-0/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The vendor_id attribute will allow for the modification of the T10 Vendor Identification string returned in inquiry responses. Its value can be viewed and modified via the ConfigFS path at: target/core/$backstore/$name/wwn/vendor_id "LIO-ORG" remains the default value, which is set when the backstore device is enabled. [mkp: corrected VPD page number] Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: target: consistently null-terminate t10_wwn stringsDavid Disseldorp2018-12-075-99/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation for supporting user provided vendor strings, add an extra byte to the vendor, model and revision arrays in struct t10_wwn. This ensures that the full INQUIRY data can be carried in the arrays along with a null-terminator. Change a number of array readers and writers so that they account for explicit null-termination: - The pscsi_set_inquiry_info() and emulate_model_alias_store() codepaths don't currently explicitly null-terminate; fix this. - Existing t10_wwn field dumps use for-loops which step over null-terminators for right-padding. + Use printf with width specifiers instead. Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: target: use consistent left-aligned ASCII INQUIRY dataDavid Disseldorp2018-12-071-5/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | spc5r17.pdf specifies: 4.3.1 ASCII data field requirements ASCII data fields shall contain only ASCII printable characters (i.e., code values 20h to 7Eh) and may be terminated with one or more ASCII null (00h) characters. ASCII data fields described as being left-aligned shall have any unused bytes at the end of the field (i.e., highest offset) and the unused bytes shall be filled with ASCII space characters (20h). LIO currently space-pads the T10 VENDOR IDENTIFICATION and PRODUCT IDENTIFICATION fields in the standard INQUIRY data. However, the PRODUCT REVISION LEVEL field in the standard INQUIRY data as well as the T10 VENDOR IDENTIFICATION field in the INQUIRY Device Identification VPD Page are zero-terminated/zero-padded. Fix this inconsistency by using space-padding for all of the above fields. Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bryant G. Ly <bly@catalogicsoftware.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: target/core: Reduce the amount of code executed with a spinlock heldBart Van Assche2018-12-071-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Due to the "make ABORT and LUN RESET handling synchronous" patch, cmd->work is only modified from the regular command execution path and no longer asynchronously by the code that executes task management functions. Since the regular command execution code is sequential per command, no locking is required to manipulate cmd->work. Hence stop protecting cmd->work manipulations with locking. Cc: Nicholas Bellinger <nab@linux-iscsi.org> Cc: Mike Christie <mchristi@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: David Disseldorp <ddiss@suse.de> Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: target/core: Make ABORT and LUN RESET handling synchronousBart Van Assche2018-12-073-134/+147
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of invoking target driver callback functions from the context that handles an abort or LUN RESET task management function, only set the abort flag from that context and perform the actual abort handling from the context of the regular command processing flow. This approach has the advantage that the task management code becomes much easier to read and to verify since the number of potential race conditions against the command processing flow is strongly reduced. This patch has been tested by running the following two shell commands concurrently for about ten minutes for both the iSCSI and the SRP target drivers ($dev is an initiator device node connected with storage provided by the target driver under test): * fio with data verification enabled on a filesystem mounted on top of $dev. * while true; do sg_reset -d $dev; echo -n .; sleep .1; done Cc: Nicholas Bellinger <nab@linux-iscsi.org> Cc: Mike Christie <mchristi@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: David Disseldorp <ddiss@suse.de> Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: target/core: Fix TAS handling for aborted commandsBart Van Assche2018-12-074-87/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The TASK ABORTED STATUS (TAS) bit is defined as follows in SAM: "TASK_ABORTED: this status shall be returned if a command is aborted by a command or task management function on another I_T nexus and the control mode page TAS bit is set to one". TAS handling is spread over the target core and the iSCSI target driver. If a LUN RESET is received, the target core will send the TASK_ABORTED response for all commands for which such a response has to be sent. If an ABORT TASK is received, only the iSCSI target driver will send the TASK_ABORTED response for the commands for which that response has to be sent. That is a bug since all target drivers have to honor the TAS bit. Fix this by moving the code that handles TAS from the iSCSI target driver into the target core. Additionally, if a command has been aborted, instead of sending the TASK_ABORTED status from the context that processes the SCSI command send it from the context of the ABORT TMF. The core_tmr_abort_task() change in this patch causes the CMD_T_TAS flag to be set if a TASK_ABORTED status has to be sent back to the initiator that submitted the command. If that flag has been set transport_cmd_finish_abort() will send the TASK_ABORTED response. Cc: Nicholas Bellinger <nab@linux-iscsi.org> Cc: Mike Christie <mchristi@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: David Disseldorp <ddiss@suse.de> Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: target/core: Simplify the code for aborting SCSI commandsBart Van Assche2018-12-072-20/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of allowing the code that aborts a SCSI command to finish before all iSCSI data frames have been received, make that code wait until all iSCSI data frames have been received. Introduce a new member variable in the target driver template to communicate that information from the iSCSI target driver to the target core. This change allows to leave out the check whether or not it is already safe to send the TASK_ABORTED reply from transport_send_task_abort(). Cc: Nicholas Bellinger <nab@linux-iscsi.org> Cc: Mike Christie <mchristi@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: David Disseldorp <ddiss@suse.de> Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: target/core: Make it possible to wait from more than one context for ↵Bart Van Assche2018-12-071-5/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | command completion This patch does not change any functionality but makes the patch that makes TMF handling synchronous easier to read. Cc: Nicholas Bellinger <nab@linux-iscsi.org> Cc: Mike Christie <mchristi@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: David Disseldorp <ddiss@suse.de> Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: target/core: Use system workqueues for TMFBart Van Assche2018-12-072-17/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A quote from SAM-5: "The order in which task management requests are processed is not specified by the SCSI architecture model. The SCSI architecture model does not require in-order delivery of such task management requests or processing by the task manager in the order received. To guarantee the processing order of task management requests referencing sent to a specific logical unit, an application client should not have more than one such task management request pending to that logical unit." This means that it is safe to use the system workqueues instead of tmr_wq for processing TMFs. An intended side effect of this patch is that it enables concurrent processing of TMFs. Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Nicholas Bellinger <nab@linux-iscsi.org> Cc: Mike Christie <mchristi@redhat.com> Cc: David Disseldorp <ddiss@suse.de> Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: target/core: Make sure that target_wait_for_sess_cmds() waits long enoughBart Van Assche2018-12-072-11/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A session must only be released after all code that accesses the session structure has finished. Make sure that this is the case by introducing a new command counter per session that is only decremented after the .release_cmd() callback has finished. This patch fixes the following crash: BUG: KASAN: use-after-free in do_raw_spin_lock+0x1c/0x130 Read of size 4 at addr ffff8801534b16e4 by task rmdir/14805 CPU: 16 PID: 14805 Comm: rmdir Not tainted 4.18.0-rc2-dbg+ #5 Call Trace: dump_stack+0xa4/0xf5 print_address_description+0x6f/0x270 kasan_report+0x241/0x360 __asan_load4+0x78/0x80 do_raw_spin_lock+0x1c/0x130 _raw_spin_lock_irqsave+0x52/0x60 srpt_set_ch_state+0x27/0x70 [ib_srpt] srpt_disconnect_ch+0x1b/0xc0 [ib_srpt] srpt_close_session+0xa8/0x260 [ib_srpt] target_shutdown_sessions+0x170/0x180 [target_core_mod] core_tpg_del_initiator_node_acl+0xf3/0x200 [target_core_mod] target_fabric_nacl_base_release+0x25/0x30 [target_core_mod] config_item_release+0x9c/0x110 [configfs] config_item_put+0x26/0x30 [configfs] configfs_rmdir+0x3b8/0x510 [configfs] vfs_rmdir+0xb3/0x1e0 do_rmdir+0x262/0x2c0 do_syscall_64+0x77/0x230 entry_SYSCALL_64_after_hwframe+0x49/0xbe Cc: Nicholas Bellinger <nab@linux-iscsi.org> Cc: Mike Christie <mchristi@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: David Disseldorp <ddiss@suse.de> Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: target/core: Simplify transport_clear_lun_ref()Bart Van Assche2018-12-073-32/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since transport_clear_lun_ref() already waits until the percpu-refcount .release() method is called, it is not necessary to wait first until percpu_ref_kill_and_confirm() has finished transitioning the refcount into atomic mode. Remove the code that waits for percpu_ref_kill_and_confirm() to complete and also the completion object that is used by that code. This patch does not change the behavior of the SCSI target code. Cc: Nicholas Bellinger <nab@linux-iscsi.org> Cc: Mike Christie <mchristi@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: David Disseldorp <ddiss@suse.de> Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: target/core: Use kvcalloc() instead of open-coding itBart Van Assche2018-12-071-7/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch does not change any functionality. Note: the code that frees sess_cmd_map already uses kvfree() so that code does not need to be modified. Reviewed-by: David Disseldorp <ddiss@suse.de> Cc: Nicholas Bellinger <nab@linux-iscsi.org> Cc: Mike Christie <mchristi@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: target/tcmu: Fix queue_cmd_ring() declarationBart Van Assche2018-12-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch does not change any functionality but avoids that sparse complains about the queue_cmd_ring() function and its callers. Fixes: 6fd0ce79724d ("tcmu: prep queue_cmd_ring to be used by unmap wq") Reviewed-by: David Disseldorp <ddiss@suse.de> Cc: Nicholas Bellinger <nab@linux-iscsi.org> Cc: Mike Christie <mchristi@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: target: replace fabric_ops.name with fabric_aliasDavid Disseldorp2018-11-285-16/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | iscsi_target_mod is the only LIO fabric where fabric_ops.name differs from the fabric_ops.fabric_name string. fabric_ops.name is used when matching target/$fabric ConfigFS create paths, so rename it .fabric_alias and fallback to target/$fabric vs .fabric_name comparison if .fabric_alias isn't initialised. iscsi_target_mod is the only fabric module to set .fabric_alias . All other fabric modules rely on .fabric_name matching and can drop the duplicate string. Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: target: drop unnecessary get_fabric_name() accessor from fabric_opsDavid Disseldorp2018-11-2815-122/+97
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All fabrics return a const string. In all cases *except* iSCSI the get_fabric_name() string matches fabric_ops.name. Both fabric_ops.get_fabric_name() and fabric_ops.name are user-facing, with the former being used for PR/ALUA state and the latter for ConfigFS (config/target/$name), so we unfortunately need to keep both strings around for now. Replace the useless .get_fabric_name() accessor function with a const string fabric_name member variable. Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: target: drop unused pi_prot_format attribute storageDavid Disseldorp2018-11-281-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | On write, the pi_prot_format configfs attribute invokes the device format_prot() callback if present. Read dumps the contents of se_dev_attrib.pi_prot_format which is always zero. Make the configfs attribute write-only, and drop the always zero se_dev_attrib.pi_prot_format storage. Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: target: add emulate_pr backstore attr to toggle PR supportDavid Disseldorp2018-11-214-6/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The new emulate_pr backstore attribute allows for Persistent Reservation and SCSI2 RESERVE/RELEASE support to be completely disabled. This can be useful for scenarios such as: - Ensuring ATS (Compare & Write) usage on recent VMware ESXi initiators. - Allowing clustered (e.g. tcm-user) backends to block such requests, avoiding the multi-node reservation state propagation. When explicitly disabled, PR and RESERVE/RELEASE requests receive Invalid Command Operation Code response sense data. Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | Merge tag 'for-4.21/block-20181221' of git://git.kernel.dk/linux-blockLinus Torvalds2018-12-282-6/+8
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block updates from Jens Axboe: "This is the main pull request for block/storage for 4.21. Larger than usual, it was a busy round with lots of goodies queued up. Most notable is the removal of the old IO stack, which has been a long time coming. No new features for a while, everything coming in this week has all been fixes for things that were previously merged. This contains: - Use atomic counters instead of semaphores for mtip32xx (Arnd) - Cleanup of the mtip32xx request setup (Christoph) - Fix for circular locking dependency in loop (Jan, Tetsuo) - bcache (Coly, Guoju, Shenghui) * Optimizations for writeback caching * Various fixes and improvements - nvme (Chaitanya, Christoph, Sagi, Jay, me, Keith) * host and target support for NVMe over TCP * Error log page support * Support for separate read/write/poll queues * Much improved polling * discard OOM fallback * Tracepoint improvements - lightnvm (Hans, Hua, Igor, Matias, Javier) * Igor added packed metadata to pblk. Now drives without metadata per LBA can be used as well. * Fix from Geert on uninitialized value on chunk metadata reads. * Fixes from Hans and Javier to pblk recovery and write path. * Fix from Hua Su to fix a race condition in the pblk recovery code. * Scan optimization added to pblk recovery from Zhoujie. * Small geometry cleanup from me. - Conversion of the last few drivers that used the legacy path to blk-mq (me) - Removal of legacy IO path in SCSI (me, Christoph) - Removal of legacy IO stack and schedulers (me) - Support for much better polling, now without interrupts at all. blk-mq adds support for multiple queue maps, which enables us to have a map per type. This in turn enables nvme to have separate completion queues for polling, which can then be interrupt-less. Also means we're ready for async polled IO, which is hopefully coming in the next release. - Killing of (now) unused block exports (Christoph) - Unification of the blk-rq-qos and blk-wbt wait handling (Josef) - Support for zoned testing with null_blk (Masato) - sx8 conversion to per-host tag sets (Christoph) - IO priority improvements (Damien) - mq-deadline zoned fix (Damien) - Ref count blkcg series (Dennis) - Lots of blk-mq improvements and speedups (me) - sbitmap scalability improvements (me) - Make core inflight IO accounting per-cpu (Mikulas) - Export timeout setting in sysfs (Weiping) - Cleanup the direct issue path (Jianchao) - Export blk-wbt internals in block debugfs for easier debugging (Ming) - Lots of other fixes and improvements" * tag 'for-4.21/block-20181221' of git://git.kernel.dk/linux-block: (364 commits) kyber: use sbitmap add_wait_queue/list_del wait helpers sbitmap: add helpers for add/del wait queue handling block: save irq state in blkg_lookup_create() dm: don't reuse bio for flushes nvme-pci: trace SQ status on completions nvme-rdma: implement polling queue map nvme-fabrics: allow user to pass in nr_poll_queues nvme-fabrics: allow nvmf_connect_io_queue to poll nvme-core: optionally poll sync commands block: make request_to_qc_t public nvme-tcp: fix spelling mistake "attepmpt" -> "attempt" nvme-tcp: fix endianess annotations nvmet-tcp: fix endianess annotations nvme-pci: refactor nvme_poll_irqdisable to make sparse happy nvme-pci: only set nr_maps to 2 if poll queues are supported nvmet: use a macro for default error location nvmet: fix comparison of a u16 with -1 blk-mq: enable IO poll if .nr_queues of type poll > 0 blk-mq: change blk_mq_queue_busy() to blk_mq_queue_inflight() blk-mq: skip zero-queue maps in blk_mq_map_swqueue ...
| * | sbitmap: optimize wakeup checkJens Axboe2018-11-301-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Even if we have no waiters on any of the sbitmap_queue wait states, we still have to loop every entry to check. We do this for every IO, so the cost adds up. Shift a bit of the cost to the slow path, when we actually have waiters. Wrap prepare_to_wait_exclusive() and finish_wait(), so we can maintain an internal count of how many are currently active. Then we can simply check this count in sbq_wake_ptr() and not have to loop if we don't have any sleepers. Convert the two users of sbitmap with waiting, blk-mq-tag and iSCSI. Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | Merge tag 'v4.20-rc3' into for-4.21/blockJens Axboe2018-11-181-2/+2
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge in -rc3 to resolve a few conflicts, but also to get a few important fixes that have gone into mainline since the block 4.21 branch was forked off (most notably the SCSI queue issue, which is both a conflict AND needed fix). Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | block: remove __blk_put_request()Jens Axboe2018-11-071-1/+1
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | Now there's no difference between blk_put_request() and __blk_put_request() anymore, get rid of the underscore version and convert the few callers. Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds2018-12-271-4/+4
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking updates from David Miller: 1) New ipset extensions for matching on destination MAC addresses, from Stefano Brivio. 2) Add ipv4 ttl and tos, plus ipv6 flow label and hop limit offloads to nfp driver. From Stefano Brivio. 3) Implement GRO for plain UDP sockets, from Paolo Abeni. 4) Lots of work from Michał Mirosław to eliminate the VLAN_TAG_PRESENT bit so that we could support the entire vlan_tci value. 5) Rework the IPSEC policy lookups to better optimize more usecases, from Florian Westphal. 6) Infrastructure changes eliminating direct manipulation of SKB lists wherever possible, and to always use the appropriate SKB list helpers. This work is still ongoing... 7) Lots of PHY driver and state machine improvements and simplifications, from Heiner Kallweit. 8) Various TSO deferral refinements, from Eric Dumazet. 9) Add ntuple filter support to aquantia driver, from Dmitry Bogdanov. 10) Batch dropping of XDP packets in tuntap, from Jason Wang. 11) Lots of cleanups and improvements to the r8169 driver from Heiner Kallweit, including support for ->xmit_more. This driver has been getting some much needed love since he started working on it. 12) Lots of new forwarding selftests from Petr Machata. 13) Enable VXLAN learning in mlxsw driver, from Ido Schimmel. 14) Packed ring support for virtio, from Tiwei Bie. 15) Add new Aquantia AQtion USB driver, from Dmitry Bezrukov. 16) Add XDP support to dpaa2-eth driver, from Ioana Ciocoi Radulescu. 17) Implement coalescing on TCP backlog queue, from Eric Dumazet. 18) Implement carrier change in tun driver, from Nicolas Dichtel. 19) Support msg_zerocopy in UDP, from Willem de Bruijn. 20) Significantly improve garbage collection of neighbor objects when the table has many PERMANENT entries, from David Ahern. 21) Remove egdev usage from nfp and mlx5, and remove the facility completely from the tree as it no longer has any users. From Oz Shlomo and others. 22) Add a NETDEV_PRE_CHANGEADDR so that drivers can veto the change and therefore abort the operation before the commit phase (which is the NETDEV_CHANGEADDR event). From Petr Machata. 23) Add indirect call wrappers to avoid retpoline overhead, and use them in the GRO code paths. From Paolo Abeni. 24) Add support for netlink FDB get operations, from Roopa Prabhu. 25) Support bloom filter in mlxsw driver, from Nir Dotan. 26) Add SKB extension infrastructure. This consolidates the handling of the auxiliary SKB data used by IPSEC and bridge netfilter, and is designed to support the needs to MPTCP which could be integrated in the future. 27) Lots of XDP TX optimizations in mlx5 from Tariq Toukan. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1845 commits) net: dccp: fix kernel crash on module load drivers/net: appletalk/cops: remove redundant if statement and mask bnx2x: Fix NULL pointer dereference in bnx2x_del_all_vlans() on some hw net/net_namespace: Check the return value of register_pernet_subsys() net/netlink_compat: Fix a missing check of nla_parse_nested ieee802154: lowpan_header_create check must check daddr net/mlx4_core: drop useless LIST_HEAD mlxsw: spectrum: drop useless LIST_HEAD net/mlx5e: drop useless LIST_HEAD iptunnel: Set tun_flags in the iptunnel_metadata_reply from src net/mlx5e: fix semicolon.cocci warnings staging: octeon: fix build failure with XFRM enabled net: Revert recent Spectre-v1 patches. can: af_can: Fix Spectre v1 vulnerability packet: validate address length if non-zero nfc: af_nfc: Fix Spectre v1 vulnerability phonet: af_phonet: Fix Spectre v1 vulnerability net: core: Fix Spectre v1 vulnerability net: minor cleanup in skb_ext_add() net: drop the unused helper skb_ext_get() ...
| * | | cxgb4: use new fw interface to get the VIN and smt indexSantosh Rastapur2018-11-231-4/+4
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | If the fw supports returning VIN/VIVLD in FW_VI_CMD save it in port_info structure else retrieve these from viid and save them in port_info structure. Do the same for smt_idx from FW_VI_MAC_CMD Signed-off-by: Santosh Rastapur <santosh@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | scsi: target: iscsi: cxgbit: add missing spin_lock_init()Varun Prakash2018-12-121-0/+1
| | | | | | | | | | | | | | | | | | | | | Add missing spin_lock_init() for cdev->np_lock. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | | scsi: target: iscsi: cxgbit: fix csk leakVarun Prakash2018-12-121-1/+4
|/ / | | | | | | | | | | | | In case of arp failure call cxgbit_put_csk() to free csk. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* / scsi: target/core: Avoid that a kernel oops is triggered when COMPARE AND ↵Bart Van Assche2018-11-051-2/+2
|/ | | | | | | | | | | | | WRITE fails Fixes: aa73237dcb2d ("scsi: target/core: Always call transport_complete_callback() upon failure") Reviewed-by: David Disseldorp <ddiss@suse.de> Cc: Nicholas Bellinger <nab@linux-iscsi.org> Cc: Mike Christie <mchristi@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds2018-11-032-5/+5
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull more SCSI updates from James Bottomley: "This is a set of minor small (and safe changes) that didn't make the initial pull request plus some bug fixes" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: mvsas: Remove set but not used variable 'id' scsi: qla2xxx: Remove two arguments from qlafx00_error_entry() scsi: qla2xxx: Make sure that qlafx00_ioctl_iosb_entry() initializes 'res' scsi: qla2xxx: Remove a set-but-not-used variable scsi: qla2xxx: Make qla2x00_sysfs_write_nvram() easier to analyze scsi: qla2xxx: Declare local functions 'static' scsi: qla2xxx: Improve several kernel-doc headers scsi: qla2xxx: Modify fall-through annotations scsi: 3w-sas: 3w-9xxx: Use unsigned char for cdb scsi: mvsas: Use dma_pool_zalloc scsi: target: Don't request modules that aren't even built scsi: target: Set response length for REPORT TARGET PORT GROUPS
| * scsi: target: Don't request modules that aren't even builtRoland Dreier2018-10-231-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | If, for example, I don't enable CONFIG_TCM_PSCSI, then every time I load the target subsystem, I get an annoying Unable to load target_core_pscsi kernel log message. Instead let's only request_module() on things if that code is enabled. Signed-off-by: Roland Dreier <roland@purestorage.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: target: Set response length for REPORT TARGET PORT GROUPSRoland Dreier2018-10-231-1/+1
| | | | | | | | | | | | | | | | One more place where we can return the length we actually fill in. Signed-off-by: Roland Dreier <roland@purestorage.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | Merge branch 'work.afs' of ↵Linus Torvalds2018-11-012-7/+5
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull AFS updates from Al Viro: "AFS series, with some iov_iter bits included" * 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (26 commits) missing bits of "iov_iter: Separate type from direction and use accessor functions" afs: Probe multiple fileservers simultaneously afs: Fix callback handling afs: Eliminate the address pointer from the address list cursor afs: Allow dumping of server cursor on operation failure afs: Implement YFS support in the fs client afs: Expand data structure fields to support YFS afs: Get the target vnode in afs_rmdir() and get a callback on it afs: Calc callback expiry in op reply delivery afs: Fix FS.FetchStatus delivery from updating wrong vnode afs: Implement the YFS cache manager service afs: Remove callback details from afs_callback_break struct afs: Commit the status on a new file/dir/symlink afs: Increase to 64-bit volume ID and 96-bit vnode ID for YFS afs: Don't invoke the server to read data beyond EOF afs: Add a couple of tracepoints to log I/O errors afs: Handle EIO from delivery function afs: Fix TTL on VL server and address lists afs: Implement VL server rotation afs: Improve FS server rotation error handling ...
| * | iov_iter: Separate type from direction and use accessor functionsDavid Howells2018-10-242-7/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the iov_iter struct, separate the iterator type from the iterator direction and use accessor functions to access them in most places. Convert a bunch of places to use switch-statements to access them rather then chains of bitwise-AND statements. This makes it easier to add further iterator types. Also, this can be more efficient as to implement a switch of small contiguous integers, the compiler can use ~50% fewer compare instructions than it has to use bitwise-and instructions. Further, cease passing the iterator type into the iterator setup function. The iterator function can set that itself. Only the direction is required. Signed-off-by: David Howells <dhowells@redhat.com>
* | | Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds2018-10-2515-95/+121
|\ \ \ | | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull SCSI updates from James Bottomley: "This is mostly updates of the usual drivers: UFS, esp_scsi, NCR5380, qla2xxx, lpfc, libsas, hisi_sas. In addition there's a set of mostly small updates to the target subsystem a set of conversions to the generic DMA API, which do have some potential for issues in the older drivers but we'll handle those as case by case fixes. A new myrs driver for the DAC960/mylex raid controllers to replace the block based DAC960 which is also being removed by Jens in this merge window. Plus the usual slew of trivial changes" [ "myrs" stands for "MYlex Raid Scsi". Obviously. Silly of me to even wonder. There's also a "myrb" driver, where the 'b' stands for 'block'. Truly, somebody has got mad naming skillz. - Linus ] * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (237 commits) scsi: myrs: Fix the processor absent message in processor_show() scsi: myrs: Fix a logical vs bitwise bug scsi: hisi_sas: Fix NULL pointer dereference scsi: myrs: fix build failure on 32 bit scsi: fnic: replace gross legacy tag hack with blk-mq hack scsi: mesh: switch to generic DMA API scsi: ips: switch to generic DMA API scsi: smartpqi: fully convert to the generic DMA API scsi: vmw_pscsi: switch to generic DMA API scsi: snic: switch to generic DMA API scsi: qla4xxx: fully convert to the generic DMA API scsi: qla2xxx: fully convert to the generic DMA API scsi: qla1280: switch to generic DMA API scsi: qedi: fully convert to the generic DMA API scsi: qedf: fully convert to the generic DMA API scsi: pm8001: switch to generic DMA API scsi: nsp32: switch to generic DMA API scsi: mvsas: fully convert to the generic DMA API scsi: mvumi: switch to generic DMA API scsi: mpt3sas: switch to generic DMA API ...
| * | scsi: target/core: Always call transport_complete_callback() upon failureBart Van Assche2018-10-162-9/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | COMPARE AND WRITE command execution starts with a call of sbc_compare_and_write(). That function locks the caw_sem member in the backend device data structure and submits a read request to the backend driver. Upon successful completion of the read compare_and_write_callback() gets called. That last function compares the data that has been read. If it matches transport_complete_callback is set to compare_and_write_post and a write request is submitted. compare_and_write_post() submits a write request to the backend driver. XDWRITEREAD command execution starts with sbc_execute_rw() submitting a read to the backend device. Upon successful completion of the read the xdreadwrite_callback() gets called. That function xors the data that has been read with the data in the data-out buffer and stores the result in the data-in buffer. Call transport_complete_callback() not only if COMPARE AND WRITE fails but also if XDWRITEREAD fails. This makes the code more systematic. Make sure that the callback functions handle (cmd, false, NULL) argument triples fine. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Nicholas Bellinger <nab@linux-iscsi.org> Cc: Mike Christie <mchristi@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | scsi: target/core: Use sg_alloc_table() instead of open-coding itBart Van Assche2018-10-161-6/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The purpose of sg_alloc_table() is to allocate and initialize an sg-list. Use that function instead of open-coding it. This patch will make it easier to share code for caching sg-list allocations between the SCSI and NVMe target cores. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Cc: Nicholas Bellinger <nab@linux-iscsi.org> Cc: Mike Christie <mchristi@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | scsi: target/core: Use the SECTOR_SHIFT constantBart Van Assche2018-10-162-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of duplicating the SECTOR_SHIFT definition from <linux/blkdev.h>, use it. This patch does not change any functionality. Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Nicholas Bellinger <nab@linux-iscsi.org> Cc: Mike Christie <mchristi@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | scsi: target/core: Remove the SCF_COMPARE_AND_WRITE_POST flagBart Van Assche2018-10-161-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 057085e522f8 ("target: Fix race for SCF_COMPARE_AND_WRITE_POST checking") removed the code that checks the SCF_COMPARE_AND_WRITE_POST flag. Hence also remove the flag itself. Cc: Nicholas Bellinger <nab@linux-iscsi.org> Cc: Mike Christie <mchristi@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | scsi: target/core: Remove an unused data member from struct xcopy_pt_cmdBart Van Assche2018-10-161-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A value is assigned to the xcopy_op member of struct xcopy_pt_cmd but that value is never used. Hence remove the xcopy_op member. Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Nicholas Bellinger <nab@linux-iscsi.org> Cc: Mike Christie <mchristi@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | scsi: target/core: Fix spelling in two source code commentsBart Van Assche2018-10-161-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change one occurrence of "aleady" into "already" and one occurrence of "is" into "if". Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Nicholas Bellinger <nab@linux-iscsi.org> Cc: Mike Christie <mchristi@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | scsi: target: stash sess_err_stats on Data-Out timeoutDavid Disseldorp2018-10-161-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | sess_err_stats are currently filled on NOP ping timeout, but not Data-Out timeout. Stash details of Data-Out timeouts using a ISCSI_SESS_ERR_CXN_TIMEOUT value for last_sess_failure_type. Signed-off-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | scsi: target: split out helper for cxn timeout error stashingDavid Disseldorp2018-10-163-30/+22
| | | | | | | | | | | | | | | | | | | | | Replace existing nested code blocks with helper function calls. Signed-off-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | scsi: target: log NOP ping timeouts as errorsDavid Disseldorp2018-10-161-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | Events resulting in connection outages like this should be logged as errors. Include the I_T Nexus in the message to aid path identification. Signed-off-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | scsi: target: log Data-Out timeouts as errorsDavid Disseldorp2018-10-161-5/+11
| | | | | | | | | | | | | | | | | | | | | | | | Data-Out timeouts resulting in connection outages should be logged as errors. Include the I_T Nexus in the message to aid path identification. Signed-off-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | scsi: target: use ISCSI_IQN_LEN in iscsi_target_statDavid Disseldorp2018-10-161-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | Move the ISCSI_IQN_LEN definition up, so that it can be used in more places instead of a hardcoded value. Signed-off-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | scsi: target: Fix target_wait_for_sess_cmds breakage with active signalsNicholas Bellinger2018-10-161-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the addition of commit 00d909a10710 ("scsi: target: Make the session shutdown code also wait for commands that are being aborted") in v4.19-rc, it incorrectly assumes no signals will be pending for task_struct executing the normal session shutdown and I/O quiesce code-path. For example, iscsi-target and iser-target issue SIGINT to all kthreads as part of session shutdown. This has been the behaviour since day one. As-is when signals are pending with se_cmds active in se_sess->sess_cmd_list, wait_event_interruptible_lock_irq_timeout() returns a negative number and immediately kills the machine because of the do while (ret <= 0) loop that was added in commit 00d909a107 to spin while backend I/O is taking any amount of extended time (say 30 seconds) to complete. Here's what it looks like in action with debug plus delayed backend I/O completion: [ 4951.909951] se_sess: 000000003e7e08fa before target_wait_for_sess_cmds [ 4951.914600] target_wait_for_sess_cmds: signal_pending: 1 [ 4951.918015] wait_event_interruptible_lock_irq_timeout ret: -512 signal_pending: 1 loop count: 0 [ 4951.921639] wait_event_interruptible_lock_irq_timeout ret: -512 signal_pending: 1 loop count: 1 [ 4951.921944] wait_event_interruptible_lock_irq_timeout ret: -512 signal_pending: 1 loop count: 2 [ 4951.921944] wait_event_interruptible_lock_irq_timeout ret: -512 signal_pending: 1 loop count: 3 [ 4951.921944] wait_event_interruptible_lock_irq_timeout ret: -512 signal_pending: 1 loop count: 4 [ 4951.921944] wait_event_interruptible_lock_irq_timeout ret: -512 signal_pending: 1 loop count: 5 [ 4951.921944] wait_event_interruptible_lock_irq_timeout ret: -512 signal_pending: 1 loop count: 6 [ 4951.921944] wait_event_interruptible_lock_irq_timeout ret: -512 signal_pending: 1 loop count: 7 [ 4951.921944] wait_event_interruptible_lock_irq_timeout ret: -512 signal_pending: 1 loop count: 8 [ 4951.921944] wait_event_interruptible_lock_irq_timeout ret: -512 signal_pending: 1 loop count: 9 ... followed by the usual RCU CPU stalls and deadlock. There was never a case pre commit 00d909a107 where wait_for_complete(&se_cmd->cmd_wait_comp) was able to be interrupted, so to address this for v4.19+ moving forward go ahead and use wait_event_lock_irq_timeout() instead so new code works with all fabric drivers. Also for commit 00d909a107, fix a minor regression in target_release_cmd_kref() to only wake_up the new se_sess->cmd_list_wq only when shutdown has actually been triggered via se_sess->sess_tearing_down. Fixes: 00d909a10710 ("scsi: target: Make the session shutdown code also wait for commands that are being aborted") Cc: <stable@vger.kernel.org> # v4.19+ Cc: Bart Van Assche <bvanassche@acm.org> Cc: Mike Christie <mchristi@redhat.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Tested-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Reviewed-by: Bryant G. Ly <bly@catalogicsoftware.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | scsi: target: iscsi: cxgbit: fix csk leakVarun Prakash2018-09-251-3/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | csk leak can happen if a new TCP connection gets established after cxgbit_accept_np() returns, to fix this leak free remaining csk in cxgbit_free_np(). Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>