summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* scsi: libsas: Add struct sas_tmf_taskJohn Garry2022-02-1911-56/+49
| | | | | | | | | | | | | | | | | | Some of the LLDDs which use libsas have their own definition of a struct to hold TMF info, so add a common struct for libsas. Also add an interim force phy id field for hisi_sas driver, which will be removed once the STP "TMF" code is factored out. Even though some LLDDs (pm8001) use a u32 for the tag, u16 will be adequate, as that named driver only uses tags in range [0, 1024). Link: https://lore.kernel.org/r/1645112566-115804-8-git-send-email-john.garry@huawei.com Tested-by: Yihang Li <liyihang6@hisilicon.com> Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: libsas: Move SMP task handlers to coreJohn Garry2022-02-193-22/+29
| | | | | | | | | | | | Move the SMP task handlers to the core host code as they will be re-used for executing internal abort and TMF tasks. Link: https://lore.kernel.org/r/1645112566-115804-7-git-send-email-john.garry@huawei.com Tested-by: Yihang Li <liyihang6@hisilicon.com> Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: hisi_sas: Delete unused I_T_NEXUS_RESET_PHYUP_TIMEOUTJohn Garry2022-02-191-2/+0
| | | | | | | | | | | There is no user, so delete it. Link: https://lore.kernel.org/r/1645112566-115804-6-git-send-email-john.garry@huawei.com Tested-by: Yihang Li <liyihang6@hisilicon.com> Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: libsas: Delete SAS_SG_ERRJohn Garry2022-02-193-6/+0
| | | | | | | | | No LLDD sets exec status as SAS_SG_ERR, so remove support. Link: https://lore.kernel.org/r/1645112566-115804-5-git-send-email-john.garry@huawei.com Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: libsas: Delete lldd_clear_aca callbackJohn Garry2022-02-1915-72/+0
| | | | | | | | | | | | | This callback is never called, so remove support. Link: https://lore.kernel.org/r/1645112566-115804-4-git-send-email-john.garry@huawei.com Tested-by: Yihang Li <liyihang6@hisilicon.com> Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Jack Wang <jinpu.wang@ionos.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: libsas: Use enum for response frame DATAPRES fieldJohn Garry2022-02-195-10/+22
| | | | | | | | | | | | | | | | As defined in table 126 of the SAS spec 1.1, use an enum for the DATAPRES field, which makes reading the code easier. Also change sas_ssp_task_response() to use a switch statement, which is more suitable (than if-else), as suggested by Christoph. Link: https://lore.kernel.org/r/1645112566-115804-3-git-send-email-john.garry@huawei.com Suggested-by: Xiang Chen <chenxiang66@hisilicon.com> Tested-by: Yihang Li <liyihang6@hisilicon.com> Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: libsas: Handle non-TMF codes in sas_scsi_find_task()John Garry2022-02-191-2/+4
| | | | | | | | | | | | | | | | | LLDD TMF callbacks may return linux or other error codes instead of TMF codes. This may cause problems in sas_scsi_find_task() -> .lldd_query_task(), as only TMF codes are handled there. As such, we may not return a task_disposition type from sas_scsi_find_task(). Function sas_eh_handle_sas_errors() only handles that type, and will only progress error handling for those recognised types. Return TASK_ABORT_FAILED upon exit on the assumption that the command may still be alive and error handling should be escalated. Link: https://lore.kernel.org/r/1645112566-115804-2-git-send-email-john.garry@huawei.com Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* Merge branch '5.17/scsi-fixes' into 5.18/scsi-stagingMartin K. Petersen2022-02-1424-120/+161
|\ | | | | | | | | | | | | Pull 5.17 fixes branch into 5.18 tree to resolve a few pm8001 driver merge conflicts. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: lpfc: Reduce log messages seen after firmware downloadJames Smart2022-02-072-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Messages around firmware download were incorrectly tagged as being related to discovery trace events. Thus, firmware download status ended up dumping the trace log as well as the firmware update message. As there were a couple of log messages in this state, the trace log was dumped multiple times. Resolve this by converting from trace events to SLI events. Link: https://lore.kernel.org/r/20220207180442.72836-1-jsmart2021@gmail.com Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: lpfc: Remove NVMe support if kernel has NVME_FC disabledJames Smart2022-02-072-5/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The driver is initiating NVMe PRLIs to determine device NVMe support. This should not be occurring if CONFIG_NVME_FC support is disabled. Correct this by changing the default value for FC4 support. Currently it defaults to FCP and NVMe. With change, when NVME_FC support is not enabled in the kernel, the default value is just FCP. Link: https://lore.kernel.org/r/20220207180516.73052-1-jsmart2021@gmail.com Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: hisi_sas: Fix setting of hisi_sas_slot.is_internalJohn Garry2022-01-311-8/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | The hisi_sas_slot.is_internal member is not set properly for ATA commands which the driver sends directly. A TMF struct pointer is normally used as a test to set this, but it is NULL for those commands. It's not ideal, but pass an empty TMF struct to set that member properly. Link: https://lore.kernel.org/r/1643627607-138785-1-git-send-email-john.garry@huawei.com Fixes: dc313f6b125b ("scsi: hisi_sas: Factor out task prep and delivery code") Reported-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: pm8001: Fix use-after-free for aborted SSP/STP sas_taskJohn Garry2022-01-311-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently a use-after-free may occur if a sas_task is aborted by the upper layer before we handle the I/O completion in mpi_ssp_completion() or mpi_sata_completion(). In this case, the following are the two steps in handling those I/O completions: - Call complete() to inform the upper layer handler of completion of the I/O. - Release driver resources associated with the sas_task in pm8001_ccb_task_free() call. When complete() is called, the upper layer may free the sas_task. As such, we should not touch the associated sas_task afterwards, but we do so in the pm8001_ccb_task_free() call. Fix by swapping the complete() and pm8001_ccb_task_free() calls ordering. Link: https://lore.kernel.org/r/1643289172-165636-4-git-send-email-john.garry@huawei.com Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: pm8001: Fix use-after-free for aborted TMF sas_taskJohn Garry2022-01-311-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently a use-after-free may occur if a TMF sas_task is aborted before we handle the IO completion in mpi_ssp_completion(). The abort occurs due to timeout. When the timeout occurs, the SAS_TASK_STATE_ABORTED flag is set and the sas_task is freed in pm8001_exec_internal_tmf_task(). However, if the I/O completion occurs later, the I/O completion still thinks that the sas_task is available. Fix this by clearing the ccb->task if the TMF times out - the I/O completion handler does nothing if this pointer is cleared. Link: https://lore.kernel.org/r/1643289172-165636-3-git-send-email-john.garry@huawei.com Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: pm8001: Fix warning for undescribed param in process_one_iomb()John Garry2022-01-311-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | make W=1 complains of an undescribed function parameter: drivers/scsi/pm8001/pm80xx_hwi.c:3938: warning: Function parameter or member 'circularQ' not described in 'process_one_iomb' Fix it. Link: https://lore.kernel.org/r/1643289172-165636-2-git-send-email-john.garry@huawei.com Reported-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: core: Reallocate device's budget map on queue depth changeMing Lei2022-01-311-5/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We currently use ->cmd_per_lun as initial queue depth for setting up the budget_map. Martin Wilck reported that it is common for the queue_depth to be subsequently updated in slave_configure() based on detected hardware characteristics. As a result, for some drivers, the static host template settings for cmd_per_lun and can_queue won't actually get used in practice. And if the default values are used to allocate the budget_map, memory may be consumed unnecessarily. Fix the issue by reallocating the budget_map after ->slave_configure() returns. At that time the device queue_depth should accurately reflect what the hardware needs. Link: https://lore.kernel.org/r/20220127153733.409132-1-ming.lei@redhat.com Cc: Bart Van Assche <bvanassche@acm.org> Reported-by: Martin Wilck <martin.wilck@suse.com> Suggested-by: Martin Wilck <martin.wilck@suse.com> Tested-by: Martin Wilck <mwilck@suse.com> Reviewed-by: Martin Wilck <mwilck@suse.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: bnx2fc: Make bnx2fc_recv_frame() mp safeJohn Meneghini2022-01-311-8/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Running tests with a debug kernel shows that bnx2fc_recv_frame() is modifying the per_cpu lport stats counters in a non-mpsafe way. Just boot a debug kernel and run the bnx2fc driver with the hardware enabled. [ 1391.699147] BUG: using smp_processor_id() in preemptible [00000000] code: bnx2fc_ [ 1391.699160] caller is bnx2fc_recv_frame+0xbf9/0x1760 [bnx2fc] [ 1391.699174] CPU: 2 PID: 4355 Comm: bnx2fc_l2_threa Kdump: loaded Tainted: G B [ 1391.699180] Hardware name: HP ProLiant DL120 G7, BIOS J01 07/01/2013 [ 1391.699183] Call Trace: [ 1391.699188] dump_stack_lvl+0x57/0x7d [ 1391.699198] check_preemption_disabled+0xc8/0xd0 [ 1391.699205] bnx2fc_recv_frame+0xbf9/0x1760 [bnx2fc] [ 1391.699215] ? do_raw_spin_trylock+0xb5/0x180 [ 1391.699221] ? bnx2fc_npiv_create_vports.isra.0+0x4e0/0x4e0 [bnx2fc] [ 1391.699229] ? bnx2fc_l2_rcv_thread+0xb7/0x3a0 [bnx2fc] [ 1391.699240] bnx2fc_l2_rcv_thread+0x1af/0x3a0 [bnx2fc] [ 1391.699250] ? bnx2fc_ulp_init+0xc0/0xc0 [bnx2fc] [ 1391.699258] kthread+0x364/0x420 [ 1391.699263] ? _raw_spin_unlock_irq+0x24/0x50 [ 1391.699268] ? set_kthread_struct+0x100/0x100 [ 1391.699273] ret_from_fork+0x22/0x30 Restore the old get_cpu/put_cpu code with some modifications to reduce the size of the critical section. Link: https://lore.kernel.org/r/20220124145110.442335-1-jmeneghi@redhat.com Fixes: d576a5e80cd0 ("bnx2fc: Improve stats update mechanism") Tested-by: Guangwu Zhang <guazhang@redhat.com> Acked-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: John Meneghini <jmeneghi@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: pm80xx: Fix double completion for SATA devicesAjish Koshy2022-01-312-44/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current code handles completions for SATA devices in mpi_sata_completion() and mpi_sata_event(). However, at the time when any SATA event happens, for almost all the event types, the command is still in the target. It is therefore incorrect to complete the task in sata_event(). There are some events for which we get sata_completions, some need recovery procedure and others abort. All the tasks must be completed via sata_completion() path. Removed the task done related code from sata_events(). For tasks where we don't get completions, let top layer call abort() to abort the command post timeout. Link: https://lore.kernel.org/r/20220124082255.86223-1-Ajish.Koshy@microchip.com Acked-by: Jack Wang <jinpu.wang@ionos.com> Co-developed-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Ajish Koshy <Ajish.Koshy@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: myrs: Fix crash in error caseTong Zhang2022-01-251-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In myrs_detect(), cs->disable_intr is NULL when privdata->hw_init() fails with non-zero. In this case, myrs_cleanup(cs) will call a NULL ptr and crash the kernel. [ 1.105606] myrs 0000:00:03.0: Unknown Initialization Error 5A [ 1.105872] myrs 0000:00:03.0: Failed to initialize Controller [ 1.106082] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 1.110774] Call Trace: [ 1.110950] myrs_cleanup+0xe4/0x150 [myrs] [ 1.111135] myrs_probe.cold+0x91/0x56a [myrs] [ 1.111302] ? DAC960_GEM_intr_handler+0x1f0/0x1f0 [myrs] [ 1.111500] local_pci_probe+0x48/0x90 Link: https://lore.kernel.org/r/20220123225717.1069538-1-ztong0001@gmail.com Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Tong Zhang <ztong0001@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: 53c700: Remove redundant assignment to pointer SCpColin Ian King2022-01-251-1/+0
| | | | | | | | | | | | | | | | | | Pointer SCp is being re-assigned the same value that it was initialized to a few lines earlier, the assignment is redundant and can be removed. Link: https://lore.kernel.org/r/20220123175530.110462-1-colin.i.king@gmail.com Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: ufs: Treat link loss as fatal errorKiwoong Kim2022-01-251-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | This event is raised when link is lost as specified in UFSHCI spec and that means communication is not possible. Thus initializing UFS interface needs to be done. Make UFS driver considers Link Lost as fatal in the INT_FATAL_ERRORS mask. This will trigger a host reset whenever a link lost interrupt occurs. Link: https://lore.kernel.org/r/1642743475-54275-1-git-send-email-kwmad.kim@samsung.com Signed-off-by: Kiwoong Kim <kwmad.kim@samsung.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: ufs: Use generic error code in ufshcd_set_dev_pwr_mode()Kiwoong Kim2022-01-251-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The return value of ufshcd_set_dev_pwr_mode() is passed to device PM core. However, the function currently returns a SCSI result which the PM core doesn't understand. This might lead to unexpected behaviors in userland; a platform reset was observed in Android. Use a generic error code for SSU failures. Link: https://lore.kernel.org/r/1642743182-54098-1-git-send-email-kwmad.kim@samsung.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Kiwoong Kim <kwmad.kim@samsung.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: bfa: Remove useless DMA-32 fallback configurationChristophe JAILLET2022-01-241-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | As stated in [1], dma_set_mask() with a 64-bit mask never fails if dev->dma_mask is non-NULL. So, if it fails, the 32-bit case will also fail for the same reason. Simplify code and remove some dead code accordingly. [1]: https://lore.kernel.org/linux-kernel/YL3vSPK5DXTNvgdx@infradead.org/#t Link: https://lore.kernel.org/r/5663cef9b54004fa56cca7ce65f51eadfc3ecddb.1642238127.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: hisi_sas: Remove useless DMA-32 fallback configurationChristophe JAILLET2022-01-242-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As stated in [1], dma_set_mask() with a 64-bit mask never fails if dev->dma_mask is non-NULL. So, if it fails, the 32-bit case will also fail for the same reason. Simplify code and remove some dead code accordingly. [1]: https://lore.kernel.org/linux-kernel/YL3vSPK5DXTNvgdx@infradead.org/#t Link: https://lore.kernel.org/r/1bf2d3660178b0e6f172e5208bc0bd68d31d9268.1642237482.git.christophe.jaillet@wanadoo.fr Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: 3w-sas: Remove useless DMA-32 fallback configurationChristophe JAILLET2022-01-241-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | As stated in [1], dma_set_mask() with a 64-bit mask never fails if dev->dma_mask is non-NULL. So, if it fails, the 32-bit case will also fail for the same reason. Simplify code and remove some dead code accordingly. [1]: https://lore.kernel.org/linux-kernel/YL3vSPK5DXTNvgdx@infradead.org/#t Link: https://lore.kernel.org/r/dbbe8671ca760972d80f8d35f3170b4609bee368.1642236763.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: bnx2fc: Flush destroy_work queue before calling bnx2fc_interface_put()John Meneghini2022-01-241-15/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The bnx2fc_destroy() functions are removing the interface before calling destroy_work. This results multiple WARNings from sysfs_remove_group() as the controller rport device attributes are removed too early. Replace the fcoe_port's destroy_work queue. It's not needed. The problem is easily reproducible with the following steps. Example: $ dmesg -w & $ systemctl enable --now fcoe $ fipvlan -s -c ens2f1 $ fcoeadm -d ens2f1.802 [ 583.464488] host2: libfc: Link down on port (7500a1) [ 583.472651] bnx2fc: 7500a1 - rport not created Yet!! [ 583.490468] ------------[ cut here ]------------ [ 583.538725] sysfs group 'power' not found for kobject 'rport-2:0-0' [ 583.568814] WARNING: CPU: 3 PID: 192 at fs/sysfs/group.c:279 sysfs_remove_group+0x6f/0x80 [ 583.607130] Modules linked in: dm_service_time 8021q garp mrp stp llc bnx2fc cnic uio rpcsec_gss_krb5 auth_rpcgss nfsv4 ... [ 583.942994] CPU: 3 PID: 192 Comm: kworker/3:2 Kdump: loaded Not tainted 5.14.0-39.el9.x86_64 #1 [ 583.984105] Hardware name: HP ProLiant DL120 G7, BIOS J01 07/01/2013 [ 584.016535] Workqueue: fc_wq_2 fc_rport_final_delete [scsi_transport_fc] [ 584.050691] RIP: 0010:sysfs_remove_group+0x6f/0x80 [ 584.074725] Code: ff 5b 48 89 ef 5d 41 5c e9 ee c0 ff ff 48 89 ef e8 f6 b8 ff ff eb d1 49 8b 14 24 48 8b 33 48 c7 c7 ... [ 584.162586] RSP: 0018:ffffb567c15afdc0 EFLAGS: 00010282 [ 584.188225] RAX: 0000000000000000 RBX: ffffffff8eec4220 RCX: 0000000000000000 [ 584.221053] RDX: ffff8c1586ce84c0 RSI: ffff8c1586cd7cc0 RDI: ffff8c1586cd7cc0 [ 584.255089] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffb567c15afc00 [ 584.287954] R10: ffffb567c15afbf8 R11: ffffffff8fbe7f28 R12: ffff8c1486326400 [ 584.322356] R13: ffff8c1486326480 R14: ffff8c1483a4a000 R15: 0000000000000004 [ 584.355379] FS: 0000000000000000(0000) GS:ffff8c1586cc0000(0000) knlGS:0000000000000000 [ 584.394419] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 584.421123] CR2: 00007fe95a6f7840 CR3: 0000000107674002 CR4: 00000000000606e0 [ 584.454888] Call Trace: [ 584.466108] device_del+0xb2/0x3e0 [ 584.481701] device_unregister+0x13/0x60 [ 584.501306] bsg_unregister_queue+0x5b/0x80 [ 584.522029] bsg_remove_queue+0x1c/0x40 [ 584.541884] fc_rport_final_delete+0xf3/0x1d0 [scsi_transport_fc] [ 584.573823] process_one_work+0x1e3/0x3b0 [ 584.592396] worker_thread+0x50/0x3b0 [ 584.609256] ? rescuer_thread+0x370/0x370 [ 584.628877] kthread+0x149/0x170 [ 584.643673] ? set_kthread_struct+0x40/0x40 [ 584.662909] ret_from_fork+0x22/0x30 [ 584.680002] ---[ end trace 53575ecefa942ece ]--- Link: https://lore.kernel.org/r/20220115040044.1013475-1-jmeneghi@redhat.com Fixes: 0cbf32e1681d ("[SCSI] bnx2fc: Avoid calling bnx2fc_if_destroy with unnecessary locks") Tested-by: Guangwu Zhang <guazhang@redhat.com> Co-developed-by: Maurizio Lombardi <mlombard@redhat.com> Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Signed-off-by: John Meneghini <jmeneghi@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: zfcp: Fix failed recovery on gone remote port with non-NPIV FCP devicesSteffen Maier2022-01-241-1/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Suppose we have an environment with a number of non-NPIV FCP devices (virtual HBAs / FCP devices / zfcp "adapter"s) sharing the same physical FCP channel (HBA port) and its I_T nexus. Plus a number of storage target ports zoned to such shared channel. Now one target port logs out of the fabric causing an RSCN. Zfcp reacts with an ADISC ELS and subsequent port recovery depending on the ADISC result. This happens on all such FCP devices (in different Linux images) concurrently as they all receive a copy of this RSCN. In the following we look at one of those FCP devices. Requests other than FSF_QTCB_FCP_CMND can be slow until they get a response. Depending on which requests are affected by slow responses, there are different recovery outcomes. Here we want to fix failed recoveries on port or adapter level by avoiding recovery requests that can be slow. We need the cached N_Port_ID for the remote port "link" test with ADISC. Just before sending the ADISC, we now intentionally forget the old cached N_Port_ID. The idea is that on receiving an RSCN for a port, we have to assume that any cached information about this port is stale. This forces a fresh new GID_PN [FC-GS] nameserver lookup on any subsequent recovery for the same port. Since we typically can still communicate with the nameserver efficiently, we now reach steady state quicker: Either the nameserver still does not know about the port so we stop recovery, or the nameserver already knows the port potentially with a new N_Port_ID and we can successfully and quickly perform open port recovery. For the one case, where ADISC returns successfully, we re-initialize port->d_id because that case does not involve any port recovery. This also solves a problem if the storage WWPN quickly logs into the fabric again but with a different N_Port_ID. Such as on virtual WWPN takeover during target NPIV failover. [https://www.redbooks.ibm.com/abstracts/redp5477.html] In that case the RSCN from the storage FDISC was ignored by zfcp and we could not successfully recover the failover. On some later failback on the storage, we could have been lucky if the virtual WWPN got the same old N_Port_ID from the SAN switch as we still had cached. Then the related RSCN triggered a successful port reopen recovery. However, there is no guarantee to get the same N_Port_ID on NPIV FDISC. Even though NPIV-enabled FCP devices are not affected by this problem, this code change optimizes recovery time for gone remote ports as a side effect. The timely drop of cached N_Port_IDs prevents unnecessary slow open port attempts. While the problem might have been in code before v2.6.32 commit 799b76d09aee ("[SCSI] zfcp: Decouple gid_pn requests from erp") this fix depends on the gid_pn_work introduced with that commit, so we mark it as culprit to satisfy fix dependencies. Note: Point-to-point remote port is already handled separately and gets its N_Port_ID from the cached peer_d_id. So resetting port->d_id in general does not affect PtP. Link: https://lore.kernel.org/r/20220118165803.3667947-1-maier@linux.ibm.com Fixes: 799b76d09aee ("[SCSI] zfcp: Decouple gid_pn requests from erp") Cc: <stable@vger.kernel.org> #2.6.32+ Suggested-by: Benjamin Block <bblock@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: pm8001: Fix bogus FW crash for maxcpus=1John Garry2022-01-242-3/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | According to the comment in check_fw_ready() we should not check the IOP1_READY field in register SCRATCH_PAD_1 for 8008 or 8009 controllers. However we check this very field in process_oq() for processing the highest index interrupt vector. The highest interrupt vector is checked as the FW is programmed to signal fatal errors through this irq. Change that function to not check IOP1_READY for those mentioned controllers, but do check ILA_READY in both cases. The reason I assume that this was not hit earlier was because we always allocated 64 MSI(X), and just did not pass the vector index check in process_oq(), i.e. the handler never ran for vector index 63. Link: https://lore.kernel.org/r/1642508105-95432-1-git-send-email-john.garry@huawei.com Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: qedf: Change context reset messages to ratelimitedSaurav Kashyap2022-01-241-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | If FCoE is not configured, libfc/libfcoe keeps on retrying FLOGI and after 3 retries driver does a context reset and tries fipvlan again. This leads to context reset message flooding the logs. Hence ratelimit the message to prevent flooding the logs. Link: https://lore.kernel.org/r/20220117135311.6256-4-njavali@marvell.com Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: qedf: Fix refcount issue when LOGO is received during TMFSaurav Kashyap2022-01-241-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hung task call trace was seen during LOGO processing. [ 974.309060] [0000:00:00.0]:[qedf_eh_device_reset:868]: 1:0:2:0: LUN RESET Issued... [ 974.309065] [0000:00:00.0]:[qedf_initiate_tmf:2422]: tm_flags 0x10 sc_cmd 00000000c16b930f op = 0x2a target_id = 0x2 lun=0 [ 974.309178] [0000:00:00.0]:[qedf_initiate_tmf:2431]: portid=016900 tm_flags =LUN RESET [ 974.309222] [0000:00:00.0]:[qedf_initiate_tmf:2438]: orig io_req = 00000000ec78df8f xid = 0x180 ref_cnt = 1. [ 974.309625] host1: rport 016900: Received LOGO request while in state Ready [ 974.309627] host1: rport 016900: Delete port [ 974.309642] host1: rport 016900: work event 3 [ 974.309644] host1: rport 016900: lld callback ev 3 [ 974.313243] [0000:61:00.2]:[qedf_execute_tmf:2383]:1: fcport is uploading, not executing flush. [ 974.313295] [0000:61:00.2]:[qedf_execute_tmf:2400]:1: task mgmt command success... [ 984.031088] INFO: task jbd2/dm-15-8:7645 blocked for more than 120 seconds. [ 984.031136] Not tainted 4.18.0-305.el8.x86_64 #1 [ 984.031166] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 984.031209] jbd2/dm-15-8 D 0 7645 2 0x80004080 [ 984.031212] Call Trace: [ 984.031222] __schedule+0x2c4/0x700 [ 984.031230] ? unfreeze_partials.isra.83+0x16e/0x1a0 [ 984.031233] ? bit_wait_timeout+0x90/0x90 [ 984.031235] schedule+0x38/0xa0 [ 984.031238] io_schedule+0x12/0x40 [ 984.031240] bit_wait_io+0xd/0x50 [ 984.031243] __wait_on_bit+0x6c/0x80 [ 984.031248] ? free_buffer_head+0x21/0x50 [ 984.031251] out_of_line_wait_on_bit+0x91/0xb0 [ 984.031257] ? init_wait_var_entry+0x50/0x50 [ 984.031268] jbd2_journal_commit_transaction+0x112e/0x19f0 [jbd2] [ 984.031280] kjournald2+0xbd/0x270 [jbd2] [ 984.031284] ? finish_wait+0x80/0x80 [ 984.031291] ? commit_timeout+0x10/0x10 [jbd2] [ 984.031294] kthread+0x116/0x130 [ 984.031300] ? kthread_flush_work_fn+0x10/0x10 [ 984.031305] ret_from_fork+0x1f/0x40 There was a ref count issue when LOGO is received during TMF. This leads to one of the I/Os hanging with the driver. Fix the ref count. Link: https://lore.kernel.org/r/20220117135311.6256-3-njavali@marvell.com Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: qedf: Add stag_work to all the vportsSaurav Kashyap2022-01-241-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Call trace seen when creating NPIV ports, only 32 out of 64 show online. stag work was not initialized for vport, hence initialize the stag work. WARNING: CPU: 8 PID: 645 at kernel/workqueue.c:1635 __queue_delayed_work+0x68/0x80 CPU: 8 PID: 645 Comm: kworker/8:1 Kdump: loaded Tainted: G IOE --------- -- 4.18.0-348.el8.x86_64 #1 Hardware name: Dell Inc. PowerEdge MX740c/0177V9, BIOS 2.12.2 07/09/2021 Workqueue: events fc_lport_timeout [libfc] RIP: 0010:__queue_delayed_work+0x68/0x80 Code: 89 b2 88 00 00 00 44 89 82 90 00 00 00 48 01 c8 48 89 42 50 41 81 f8 00 20 00 00 75 1d e9 60 24 07 00 44 89 c7 e9 98 f6 ff ff <0f> 0b eb c5 0f 0b eb a1 0f 0b eb a7 0f 0b eb ac 44 89 c6 e9 40 23 RSP: 0018:ffffae514bc3be40 EFLAGS: 00010006 RAX: ffff8d25d6143750 RBX: 0000000000000202 RCX: 0000000000000002 RDX: ffff8d2e31383748 RSI: ffff8d25c000d600 RDI: ffff8d2e31383788 RBP: ffff8d2e31380de0 R08: 0000000000002000 R09: ffff8d2e31383750 R10: ffffffffc0c957e0 R11: ffff8d2624800000 R12: ffff8d2e31380a58 R13: ffff8d2d915eb000 R14: ffff8d25c499b5c0 R15: ffff8d2e31380e18 FS: 0000000000000000(0000) GS:ffff8d2d1fb00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055fd0484b8b8 CR3: 00000008ffc10006 CR4: 00000000007706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: queue_delayed_work_on+0x36/0x40 qedf_elsct_send+0x57/0x60 [qedf] fc_lport_enter_flogi+0x90/0xc0 [libfc] fc_lport_timeout+0xb7/0x140 [libfc] process_one_work+0x1a7/0x360 ? create_worker+0x1a0/0x1a0 worker_thread+0x30/0x390 ? create_worker+0x1a0/0x1a0 kthread+0x116/0x130 ? kthread_flush_work_fn+0x10/0x10 ret_from_fork+0x35/0x40 ---[ end trace 008f00f722f2c2ff ]-- Initialize stag work for all the vports. Link: https://lore.kernel.org/r/20220117135311.6256-2-njavali@marvell.com Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: ufs: ufshcd-pltfrm: Check the return value of devm_kstrdup()Xiaoke Wang2022-01-241-0/+7
| | | | | | | | | | | | | | | | | | | | devm_kstrdup() returns pointer to allocated string on success, NULL on failure. So it is better to check the return value of it. Link: https://lore.kernel.org/r/tencent_4257E15D4A94FF9020DDCC4BB9B21C041408@qq.com Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Xiaoke Wang <xkernel.wang@foxmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: target: iscsi: Make sure the np under each tpg is uniqueZouMingzhe2022-01-241-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | iscsit_tpg_check_network_portal() has nested for_each loops and is supposed to return true when a match is found. However, the tpg loop will still continue after existing the tpg_np loop. If this tpg_np is not the last the match value will be changed. Break the outer loop after finding a match and make sure the np under each tpg is unique. Link: https://lore.kernel.org/r/20220111054742.19582-1-mingzhe.zou@easystack.cn Signed-off-by: ZouMingzhe <mingzhe.zou@easystack.cn> Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: elx: efct: Don't use GFP_KERNEL under spin lockYang Yingliang2022-01-241-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | GFP_KERNEL/GFP_DMA can't be used under a spin lock. According the comment, els_ios_lock is used to protect els ios list so we can move down the spin lock to avoid using this flag under the lock. Link: https://lore.kernel.org/r/20220111012441.3232527-1-yangyingliang@huawei.com Fixes: 8f406ef72859 ("scsi: elx: libefc: Extended link Service I/O handling") Reported-by: Hulk Robot <hulkci@huawei.com> Reviewed-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | scsi: mpi3mr: Bump driver version to 8.0.0.68.0Sreekanth Reddy2022-02-111-2/+2
| | | | | | | | | | | | | | | | Bump mpi3mr driver version to 8.0.0.68.0. Link: https://lore.kernel.org/r/20220210095817.22828-10-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | scsi: mpi3mr: Fix memory leaksSreekanth Reddy2022-02-111-1/+1
| | | | | | | | | | | | | | | | | | Fix memory leaks related to operational reply queue's memory segments which are not getting freed while unloading the driver. Link: https://lore.kernel.org/r/20220210095817.22828-9-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | scsi: mpi3mr: Update the copyright yearSreekanth Reddy2022-02-114-4/+4
| | | | | | | | | | | | | | | | Update the copyright year to 2017-2022. Link: https://lore.kernel.org/r/20220210095817.22828-8-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | scsi: mpi3mr: Fix reporting of actual data transfer sizeSreekanth Reddy2022-02-111-0/+2
| | | | | | | | | | | | | | | | | | | | The driver is missing to set the residual size while completing an I/O. Ensure proper data transfer size is reported to the kernel on I/O completion based on the transfer length reported by the firmware. Link: https://lore.kernel.org/r/20220210095817.22828-7-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | scsi: mpi3mr: Fix cmnd getting marked as in use foreverSreekanth Reddy2022-02-111-2/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a driver command which requires the driver to issue a follow up command using the same command frame is outstanding and a soft reset operation occurs, then that driver command frame is getting marked as in use permanently and won't be reused again. Clear the driver command frames while flushing out the outstanding commands and avoid issuing any new requests using these command frames while soft reset is going on. Link: https://lore.kernel.org/r/20220210095817.22828-6-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | scsi: mpi3mr: Fix hibernation issueSreekanth Reddy2022-02-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Hibernation operation fails when it is issued for second time. This is because the driver is trying to release the IOC's PCI resources after setting power state to D3. Set the IOC's power state to D3 only after releasing the IOC's PCI resources. Link: https://lore.kernel.org/r/20220210095817.22828-5-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | scsi: mpi3mr: Update MPI3 headersSreekanth Reddy2022-02-117-101/+116
| | | | | | | | | | | | | | | | Update MPI3 headers. Link: https://lore.kernel.org/r/20220210095817.22828-4-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | scsi: mpi3mr: Fix printing of pending I/O countSreekanth Reddy2022-02-111-1/+1
| | | | | | | | | | | | | | | | Print proper pending I/O count after issuing target reset TM operation. Link: https://lore.kernel.org/r/20220210095817.22828-3-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | scsi: mpi3mr: Fix deadlock while canceling the fw eventSreekanth Reddy2022-02-113-21/+91
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During controller reset, the driver tries to flush all the pending firmware event works from worker queue that are queued prior to the reset. However, if any work is waiting for device addition/removal operation to be completed at the SML, then a deadlock is observed. This is due to the controller reset waiting for the device addition/removal to be completed and the device/addition removal is waiting for the controller reset to be completed. To limit this deadlock, continue with the controller reset handling without canceling the work which is waiting for device addition/removal operation to complete at SML. Link: https://lore.kernel.org/r/20220210095817.22828-2-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | scsi: Remove unused member cmd_pool for structure scsi_host_templateXiang Chen2022-02-111-3/+0
| | | | | | | | | | | | | | | | | | | | | | After commit e9c787e65c0c ("scsi: allocate scsi_cmnd structures as part of struct request"), the member cmd_pool in structure scsi_host_template is not used, so remove it. Link: https://lore.kernel.org/r/1644561778-183074-5-git-send-email-chenxiang66@hisilicon.com Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | scsi: libsas: Remove unused parameter for function sas_ata_eh()Xiang Chen2022-02-113-7/+4
| | | | | | | | | | | | | | | | | | | | Input parameter work_q is not unused in function sas_ata_eh(), so remove it. Link: https://lore.kernel.org/r/1644561778-183074-4-git-send-email-chenxiang66@hisilicon.com Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | scsi: libsas: Remove duplicated setting for task->task_state_flagsXiang Chen2022-02-111-1/+0
| | | | | | | | | | | | | | | | | | | | Task->task_state_flags is already set in function sas_alloc_task(), so remove duplicated setting. Link: https://lore.kernel.org/r/1644561778-183074-3-git-send-email-chenxiang66@hisilicon.com Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | scsi: libsas: Use void for sas_discover_event() return codeXiang Chen2022-02-112-5/+3
| | | | | | | | | | | | | | | | | | | | The callers of function sas_discover_event() do not check its return value. The function also only ever returns 0, so use void instead. Link: https://lore.kernel.org/r/1644561778-183074-2-git-send-email-chenxiang66@hisilicon.com Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | scsi: message: fusion: Use GFP_KERNELJulia Lawall2022-02-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | Pci_driver probe functions aren't called with locks held and thus don't need GFP_ATOMIC. Use GFP_KERNEL instead. Problem found with Coccinelle. Link: https://lore.kernel.org/r/20220210204223.104181-9-Julia.Lawall@inria.fr Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | scsi: smartpqi: Fix unused variable pqi_pm_ops for clangDon Brace2022-02-111-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Driver added a new dev_pm_ops structure used only if CONFIG_PM is set. The CONFIG_PM MACRO needed to be moved up in the code to avoid the compiler warnings. The hunk to move the location was missing from the above patch. Found by kernel test robot by building driver with CONFIG_PM disabled. Link: https://lore.kernel.org/all/202202090657.bstNLuce-lkp@intel.com/ Link: https://lore.kernel.org/r/20220210201151.236170-1-don.brace@microchip.com Fixes: c66e078ad89e ("scsi: smartpqi: Fix hibernate and suspend") Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Mike Mcgowen <mike.mcgowen@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | scsi: libsas: Drop SAS_TASK_AT_INITIATORJohn Garry2022-02-1112-55/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This flag is now only ever set, so delete it. This also avoids a use-after-free in the pm8001 queue path, as reported in the following: https://lore.kernel.org/linux-scsi/c3cb7228-254e-9584-182b-007ac5e6fe0a@huawei.com/T/#m28c94c6d3ff582ec4a9fa54819180740e8bd4cfb https://lore.kernel.org/linux-scsi/0cc0c435-b4f2-9c76-258d-865ba50a29dd@huawei.com/ [mkp: checkpatch + two SAS_TASK_AT_INITIATOR references] Link: https://lore.kernel.org/r/1644489804-85730-3-git-send-email-john.garry@huawei.com Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | scsi: isci: Drop SAS_TASK_AT_INITIATOR check in isci_task_abort_task()John Garry2022-02-113-16/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the queue path, move around when we assign sas_task->lldd_task such that this pointer and the SAS_TASK_AT_INITIATOR flag are set atomically. It is also not required to clear SAS_TASK_AT_INITIATOR in isci_task_execute_task() error path as it is also cleared immediately after in isci_task_refuse() call. Now the following items may be considered: - SAS_TASK_STATE_DONE and SAS_TASK_AT_INITIATOR are mutually exclusive apart from possibly when SAS_TASK_STATE_DONE is set in sas_scsi_find_task(), but that is after .lldd_abort_task, i.e. the considered callback, is called. - If isci_task_refuse() is called in the queue path, then sas_task->lldd_task and SAS_TASK_AT_INITIATOR are cleared atomically in isci_task_refuse(). - In the completion path, SAS_TASK_STATE_DONE is set and SAS_TASK_AT_INITIATOR is cleared atomically before the sas_task.lldd_task is cleared later. So in isci_task_abort_task() if SAS_TASK_STATE_DONE is not set and sas_task.lldd_task is still set, then SAS_TASK_AT_INITIATOR must be set - so we can drop this check on SAS_TASK_AT_INITIATOR. [mkp: checkpatch] Link: https://lore.kernel.org/r/1644489804-85730-2-git-send-email-john.garry@huawei.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>