summaryrefslogtreecommitdiffstats
path: root/drivers/scsi/lpfc
Commit message (Collapse)AuthorAgeFilesLines
...
* | scsi: lpfc: Correct maximum PCI function value for RAS fw loggingJustin Tee2023-11-151-2/+2
|/ | | | | | | | | | | | | | | | | Currently, the ras_fwlog_func sysfs parameter allows users to input a value greater than three when selecting a PCI function to enable RAS fw logging feature. The user's input is sanity checked in lpfc_sli4_ras_init(), but allowing an input greater than three doesn't make sense because the max number of ports per HBA is four. Change the allowable range from [0, 7] to [0, 3]. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20231031191224.150862-2-justintee8345@gmail.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds2023-11-027-15/+48
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull SCSI updates from James Bottomley: "Updates to the usual drivers (ufs, megaraid_sas, lpfc, target, ibmvfc, scsi_debug) plus the usual assorted minor fixes and updates. The major change this time around is a prep patch for rethreading of the driver reset handler API not to take a scsi_cmd structure which starts to reduce various drivers' dependence on scsi_cmd in error handling" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (132 commits) scsi: ufs: core: Leave space for '\0' in utf8 desc string scsi: ufs: core: Conversion to bool not necessary scsi: ufs: core: Fix race between force complete and ISR scsi: megaraid: Fix up debug message in megaraid_abort_and_reset() scsi: aic79xx: Fix up NULL command in ahd_done() scsi: message: fusion: Initialize return value in mptfc_bus_reset() scsi: mpt3sas: Fix loop logic scsi: snic: Remove useless code in snic_dr_clean_pending_req() scsi: core: Add comment to target_destroy in scsi_host_template scsi: core: Clean up scsi_dev_queue_ready() scsi: pmcraid: Add missing scsi_device_put() in pmcraid_eh_target_reset_handler() scsi: target: core: Fix kernel-doc comment scsi: pmcraid: Fix kernel-doc comment scsi: core: Handle depopulation and restoration in progress scsi: ufs: core: Add support for parsing OPP scsi: ufs: core: Add OPP support for scaling clocks and regulators scsi: ufs: dt-bindings: common: Add OPP table scsi: scsi_debug: Add param to control sdev's allow_restart scsi: scsi_debug: Add debugfs interface to fail target reset scsi: scsi_debug: Add new error injection type: Reset LUN failed ...
| * Merge patch series "lpfc: Update lpfc to revision 14.2.0.15"Martin K. Petersen2023-10-137-15/+48
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Justin Tee <justintee8345@gmail.com> says: Update lpfc to revision 14.2.0.15 This patch set contains error handling fixes, ELS bug fixes, and logging improvements. The patches were cut against Martin's 6.7/scsi-queue tree. Link: https://lore.kernel.org/r/20231009161812.97232-1-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| | * scsi: lpfc: Update lpfc version to 14.2.0.15Justin Tee2023-10-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Update lpfc version to 14.2.0.15. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20231009161812.97232-7-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| | * scsi: lpfc: Introduce LOG_NODE_VERBOSE messaging flagJustin Tee2023-10-132-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The preexisting LOG_NODE message flag frequently spams a subset of the same log messages during normal FC driver operations. When analyzing driver logs, this sometimes leads to difficulty in troubleshooting. Because LOG_IP log message flag is unused, convert it to a new LOG_NODE_VERBOSE flag. The LOG_NODE_VERBOSE shall specifically be used for diagnosing issues that require precise ndlp tracking detail. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20231009161812.97232-6-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| | * scsi: lpfc: Validate ELS LS_ACC completion payloadJustin Tee2023-10-131-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A WCQE success completion status does not guarantee valid LS_ACC receipt for ELS commands. So, introduce a small helper routine that validates ELS LS_ACC frames in ELS cmpl routines. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20231009161812.97232-5-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| | * scsi: lpfc: Reject received PRLIs with only initiator fcn role for NPIV portsJustin Tee2023-10-131-4/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, NPIV ports send PRLI_ACC to all received unsolicited PRLI requests. For an NPIV port, there is no point to PRLI_ACC if the received PRLI request has the initiator function bit set and the target function bit unset. Modify the lpfc_rcv_prli_support_check() routine to send a PRLI_RJT in such cases. NPIV ports are expected to send PRLI_ACC only if the Target function bit is set. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20231009161812.97232-4-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| | * scsi: lpfc: Treat IOERR_SLI_DOWN I/O completion status the same as pci offlineJustin Tee2023-10-131-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During receipt of a hardware error attention ACQE, IOERR_SLI_DOWN status is set by the driver for all outstanding I/Os. In such hardware error attention cases, we can treat the situation exactly the same as pci_channel_offline. Thus, add IOERR_SLI_DOWN status to the same category as pci_channel_offline handling in lpfc_nvme_io_cmd_cmpl. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20231009161812.97232-3-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| | * scsi: lpfc: Remove unnecessary zero return code assignment in ↵Justin Tee2023-10-131-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | lpfc_sli4_hba_setup In order to enter the !rc if statement block in question, rc had to have been zero to begin with. Thus, the rc = 0 assignment is unnecessary and can be removed. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20231009161812.97232-2-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | | Merge tag 'x86-core-2023-10-29-v2' of ↵Linus Torvalds2023-10-301-6/+2
|\ \ \ | |/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 core updates from Thomas Gleixner: - Limit the hardcoded topology quirk for Hygon CPUs to those which have a model ID less than 4. The newer models have the topology CPUID leaf 0xB correctly implemented and are not affected. - Make SMT control more robust against enumeration failures SMT control was added to allow controlling SMT at boottime or runtime. The primary purpose was to provide a simple mechanism to disable SMT in the light of speculation attack vectors. It turned out that the code is sensible to enumeration failures and worked only by chance for XEN/PV. XEN/PV has no real APIC enumeration which means the primary thread mask is not set up correctly. By chance a XEN/PV boot ends up with smp_num_siblings == 2, which makes the hotplug control stay at its default value "enabled". So the mask is never evaluated. The ongoing rework of the topology evaluation caused XEN/PV to end up with smp_num_siblings == 1, which sets the SMT control to "not supported" and the empty primary thread mask causes the hotplug core to deny the bringup of the APS. Make the decision logic more robust and take 'not supported' and 'not implemented' into account for the decision whether a CPU should be booted or not. - Fake primary thread mask for XEN/PV Pretend that all XEN/PV vCPUs are primary threads, which makes the usage of the primary thread mask valid on XEN/PV. That is consistent with because all of the topology information on XEN/PV is fake or even non-existent. - Encapsulate topology information in cpuinfo_x86 Move the randomly scattered topology data into a separate data structure for readability and as a preparatory step for the topology evaluation overhaul. - Consolidate APIC ID data type to u32 It's fixed width hardware data and not randomly u16, int, unsigned long or whatever developers decided to use. - Cure the abuse of cpuinfo for persisting logical IDs. Per CPU cpuinfo is used to persist the logical package and die IDs. That's really not the right place simply because cpuinfo is subject to be reinitialized when a CPU goes through an offline/online cycle. Use separate per CPU data for the persisting to enable the further topology management rework. It will be removed once the new topology management is in place. - Provide a debug interface for inspecting topology information Useful in general and extremly helpful for validating the topology management rework in terms of correctness or "bug" compatibility. * tag 'x86-core-2023-10-29-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) x86/apic, x86/hyperv: Use u32 in hv_snp_boot_ap() too x86/cpu: Provide debug interface x86/cpu/topology: Cure the abuse of cpuinfo for persisting logical ids x86/apic: Use u32 for wakeup_secondary_cpu[_64]() x86/apic: Use u32 for [gs]et_apic_id() x86/apic: Use u32 for phys_pkg_id() x86/apic: Use u32 for cpu_present_to_apicid() x86/apic: Use u32 for check_apicid_used() x86/apic: Use u32 for APIC IDs in global data x86/apic: Use BAD_APICID consistently x86/cpu: Move cpu_l[l2]c_id into topology info x86/cpu: Move logical package and die IDs into topology info x86/cpu: Remove pointless evaluation of x86_coreid_bits x86/cpu: Move cu_id into topology info x86/cpu: Move cpu_core_id into topology info hwmon: (fam15h_power) Use topology_core_id() scsi: lpfc: Use topology_core_id() x86/cpu: Move cpu_die_id into topology info x86/cpu: Move phys_proc_id into topology info x86/cpu: Encapsulate topology information in cpuinfo_x86 ...
| * | scsi: lpfc: Use topology_core_id()Thomas Gleixner2023-10-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the provided topology helper. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Tested-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Zhang Rui <rui.zhang@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230814085112.446856860@linutronix.de
| * | x86/cpu: Move phys_proc_id into topology infoThomas Gleixner2023-10-101-5/+1
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename it to pkg_id which is the terminology used in the kernel. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Tested-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Zhang Rui <rui.zhang@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230814085112.329006989@linutronix.de
* | scsi: lpfc: Prevent use-after-free during rmmod with mapped NVMe rportsJustin Tee2023-09-132-8/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During rmmod, when dev_loss_tmo callback is called, an ndlp kref count is decremented twice. Once for SCSI transport registration and second to remove the initial node allocation kref. If there is also an NVMe transport registration, another reference count decrement is expected in lpfc_nvme_unregister_port(). Race conditions between the NVMe transport remoteport_delete and dev_loss_tmo callbacks sometimes results in premature ndlp object release resulting in use-after-free issues. Fix by not dropping the ndlp object in dev_loss_tmo callback with an outstanding NVMe transport registration. Inversely, mark the final NLP_DROPPED flag in lpfc_nvme_unregister_port when rmmod flag is set. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230908211923.37603-1-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | scsi: lpfc: Early return after marking final NLP_DROPPED flag in dev_loss_tmoJustin Tee2023-09-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a dev_loss_tmo event occurs, an ndlp lock is taken before checking nlp_flag for NLP_DROPPED. There is an attempt to restore the ndlp lock when exiting the if statement, but the nlp_put kref could be the final decrement causing a use-after-free memory access on a released ndlp object. Instead of trying to reacquire the ndlp lock after checking nlp_flag, just return after calling nlp_put. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230908211852.37576-1-justintee8345@gmail.com Reviewed-by: "Ewan D. Milne" <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | scsi: lpfc: Fix the NULL vs IS_ERR() bug for debugfs_create_file()Jinjie Ruan2023-09-131-7/+7
|/ | | | | | | | | | | | | | | | Since debugfs_create_file() returns ERR_PTR and never NULL, use IS_ERR() to check the return value. Fixes: 2fcbc569b9f5 ("scsi: lpfc: Make debugfs ktime stats generic for NVME and SCSI") Fixes: 4c47efc140fa ("scsi: lpfc: Move SCSI and NVME Stats to hardware queue structures") Fixes: 6a828b0f6192 ("scsi: lpfc: Support non-uniform allocation of MSIX vectors to hardware queues") Fixes: 95bfc6d8ad86 ("scsi: lpfc: Make FW logging dynamically configurable") Fixes: 9f77870870d8 ("scsi: lpfc: Add debugfs support for cm framework buffers") Fixes: c490850a0947 ("scsi: lpfc: Adapt partitioned XRI lists to efficient sharing") Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://lore.kernel.org/r/20230906030809.2847970-1-ruanjinjie@huawei.com Reviewed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* Merge branch 'fixes' into miscJames Bottomley2023-09-022-17/+5
|\
| * scsi: lpfc: Remove reftag check in DIF pathsJustin Tee2023-08-071-17/+3
| | | | | | | | | | | | | | | | | | | | | | | | When preparing protection DIF I/O for DMA, the driver obtains reference tags from scsi_prot_ref_tag(). Previously, there was a wrong assumption that an all 0xffffffff value meant error and thus the driver failed the I/O. This patch removes the evaluation code and accepts whatever the upper layer returns. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230803211932.155745-1-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * Merge branch '6.5/scsi-staging' into 6.5/scsi-fixesMartin K. Petersen2023-07-111-0/+2
| |\ | | | | | | | | | | | | | | | Pull in the currently staged SCSI fixes for 6.5. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| | * scsi: lpfc: Fix a possible data race in lpfc_unregister_fcf_rescan()Tuo Li2023-07-051-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The variable phba->fcf.fcf_flag is often protected by the lock phba->hbalock() when is accessed. Here is an example in lpfc_unregister_fcf_rescan(): spin_lock_irq(&phba->hbalock); phba->fcf.fcf_flag |= FCF_INIT_DISC; spin_unlock_irq(&phba->hbalock); However, in the same function, phba->fcf.fcf_flag is assigned with 0 without holding the lock, and thus can cause a data race: phba->fcf.fcf_flag = 0; To fix this possible data race, a lock and unlock pair is added when accessing the variable phba->fcf.fcf_flag. Reported-by: BassCheck <bass@buaa.edu.cn> Signed-off-by: Tuo Li <islituo@gmail.com> Link: https://lore.kernel.org/r/20230630024748.1035993-1-islituo@gmail.com Reviewed-by: Justin Tee <justin.tee@broadcom.com> Reviewed-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | | scsi: lpfc: Do not abuse UUID APIs and LPFC_COMPRESS_VMID_SIZEAndy Shevchenko2023-08-212-7/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The lpfc_vmid_host_uuid is not defined as uuid_t and its usage is not the same as for uuid_t operations (like exporting or importing). Hence replace call to uuid_is_null() by respective memchr_inv() without abusing casting. With that, replace LPFC_COMPRESS_VMID_SIZE with plain number and respective sizeof() to make code robust to changes in the future, if any. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20230818155452.875781-1-andriy.shevchenko@linux.intel.com Reviewed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | | scsi: lpfc: Modify when a node should be put in device recovery mode during RSCNJustin Tee2023-08-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Only nodes whose state is at least past a PLOGI issue and strictly less than a PRLI issue should be put into device recovery mode upon RSCN receipt. Previously, the allowance of LOGO and PRLI completion states did not make sense because those nodes should be allowed to flow through and marked as NPort dissappeared as is normally done. A follow up RSCN GID_FT would recover those nodes in such cases. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230804195546.157839-1-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | | scsi: lpfc: Copyright updates for 14.2.0.14 patchesJustin Tee2023-07-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Update copyrights to 2023 for files modified in the 14.2.0.14 patch set. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230712180522.112722-13-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | | scsi: lpfc: Update lpfc version to 14.2.0.14Justin Tee2023-07-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Update lpfc version to 14.2.0.14 Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230712180522.112722-12-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | | scsi: lpfc: Clean up SLI-4 sysfs resource reportingJustin Tee2023-07-231-35/+101
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, we have dated logic to work around the differences between SLI-4 and SLI-3 resource reporting through sysfs. Leave the SLI-3 path untouched, but for SLI4 path, retrieve resource values from the phba->sli4_hba->max_cfg_param structure. Max values are populated during ACQE events right after READ_CONFIG mbox cmd is sent. Instead of the dated subtraction logic, used resource calculation is directly fed into sysfs for display. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230712180522.112722-11-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | | scsi: lpfc: Refactor cpu affinity assignment pathsJustin Tee2023-07-233-28/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During initialization, a lot of the same logic is used on MSI-X vector CPU affinity assignment. Create a lpfc_next_present_cpu() helper routine, and apply its usage for refactoring purposes. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230712180522.112722-10-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | | scsi: lpfc: Abort outstanding ELS cmds when mailbox timeout error is detectedJustin Tee2023-07-234-11/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A mailbox timeout error usually indicates something has gone wrong, and a follow up reset of the HBA is a typical recovery mechanism. Introduce a MBX_TMO_ERR flag to detect such cases and have lpfc_els_flush_cmd abort ELS commands if the MBX_TMO_ERR flag condition was set. This ensures all of the registered SGL resources meant for ELS traffic are not leaked after an HBA reset. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230712180522.112722-9-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | | scsi: lpfc: Make fabric zone discovery more robust when handling unsolicited ↵Justin Tee2023-07-233-37/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | LOGO This patch provides better target rport recovery when a target rport is running in initiator mode to discover the fabric. Such a target will issue a LOGO before switching back to strict target mode and changes are made to recover the login. Log messages are also updated accordingly. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230712180522.112722-8-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | | scsi: lpfc: Set Establish Image Pair service parameter only for Target FunctionsJustin Tee2023-07-233-15/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, Establish Image Pair was set in all PRLI_ACC responses regardless if the received PRLI was from an initiator or target function. Specific target vendors that can operate in both initiator and target mode, may view the PRLI_ACC with Establish Image Pair set as an invalid service parameter when operating in initiator only mode. This causes discovery issues later when the target switches on its target mode function. Revise logic that determines an rport's role as an initiator or target and set the Establish Image Pair service parameter bit only if the Target Function bit is set. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230712180522.112722-7-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | | scsi: lpfc: Revise ndlp kref handling for dev_loss_tmo_callbk and lpfc_drop_nodeJustin Tee2023-07-232-25/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ndlp kref count implementation in lpfc_dev_loss_tmo_callbk() removes the initial node reference when a vport is unloading. When lpfc_cleanup() sends a DEVICE_RM event and is in NPR state, the driver calls lpfc_drop_node(). Subsequently, lpfc_drop_node() also removes an ndlp kref thinking it is the initial reference. This unintentionally introduces an extra kref decrement on the ndlp object. Fix by using the NLP_DROPPED node flag in lpfc_dev_loss_tmo_callbk() and lpfc_drop_node() to coordinate the removal of the initial node reference. In lpfc_dev_loss_tmo_callbk(), remove the SCSI transport reference provided the node is registered in the dev_loss context because the driver cannot call the SCSI transport in dev_loss context or afterwards. And, have lpfc_drop_node() not remove a reference if another thread is acting or has already acted on it. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230712180522.112722-6-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | | scsi: lpfc: Qualify ndlp discovery state when processing RSCNJustin Tee2023-07-231-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conditionalize when to put an ndlp into recovery mode when processing RSCNs. As long as an ndlp state is beyond a PLOGI issue and has been mapped to a transport layer before, the ndlp qualifies to be put into recovery mode. Otherwise, treat the ndlp rport normally through the discovery engine. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230712180522.112722-5-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | | scsi: lpfc: Remove extra ndlp kref decrement in FLOGI cmpl for loop topologyJustin Tee2023-07-231-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In lpfc_cmpl_els_flogi(), the return out: label decrements the ndlp kref signaling that FLOGI processing on the ndlp is complete. In loop topology path, there is an unnecessary ndlp put because it also branches to the out: label. This also signals ndlp usage completion too soon. As such, remove the extra lpfc_nlp_put() when in loop topology. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230712180522.112722-4-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | | scsi: lpfc: Simplify fcp_abort transport callback log messageJustin Tee2023-07-231-9/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The driver is reaching into a nvme_fc_cmd_iu ptr that belongs to the transport during an abort. This could cause an unintentional ptr dereference into memory that the driver does not own. Since the nvme_fc_cmd_iu ptr was for logging purposes only, simplify the log message such that the nvme_fc_cmd_iu reference is no longer needed. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230712180522.112722-3-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | | scsi: lpfc: Pull out fw diagnostic dump log message from driver's trace bufferJustin Tee2023-07-231-1/+1
|/ / | | | | | | | | | | | | | | | | | | The firmware diagnostic dump log message does not need to be a part of the driver's log trace buffer because it is an expected user triggered event. Change LOG_TRACE_EVENT verbose flag to LOG_SLI. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230712180522.112722-2-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds2023-07-081-5/+5
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull more SCSI updates from James Bottomley: "A few late arriving patches that missed the initial pull request. It's mostly bug fixes (the dt-bindings is a fix for the initial pull)" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: core: Remove unused function declaration scsi: target: docs: Remove tcm_mod_builder.py scsi: target: iblock: Quiet bool conversion warning with pr_preempt use scsi: dt-bindings: ufs: qcom: Fix ICE phandle scsi: core: Simplify scsi_cdl_check_cmd() scsi: isci: Fix comment typo scsi: smartpqi: Replace one-element arrays with flexible-array members scsi: target: tcmu: Replace strlcpy() with strscpy() scsi: ncr53c8xx: Replace strlcpy() with strscpy() scsi: lpfc: Fix lpfc_name struct packing
| * scsi: lpfc: Fix lpfc_name struct packingArnd Bergmann2023-06-211-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | clang points out that the lpfc_name structure has an 8-byte alignment requirement on most architectures, but is embedded in a number of other structures that are forced to be only 1-byte aligned: drivers/scsi/lpfc/lpfc_hw.h:1516:30: error: field pe within 'struct lpfc_fdmi_reg_port_list' is less aligned than 'struct lpfc_fdmi_port_entry' and is usually due to 'struct lpfc_fdmi_reg_port_list' being packed, which can lead to unaligned accesses [-Werror,-Wunaligned-access] struct lpfc_fdmi_port_entry pe; drivers/scsi/lpfc/lpfc_hw.h:850:19: error: field portName within 'struct _ADISC' is less aligned than 'struct lpfc_name' and is usually due to 'struct _ADISC' being packed, which can lead to unaligned accesses [-Werror,-Wunaligned-access] drivers/scsi/lpfc/lpfc_hw.h:851:19: error: field nodeName within 'struct _ADISC' is less aligned than 'struct lpfc_name' and is usually due to 'struct _ADISC' being packed, which can lead to unaligned accesses [-Werror,-Wunaligned-access] drivers/scsi/lpfc/lpfc_hw.h:922:19: error: field portName within 'struct _RNID' is less aligned than 'struct lpfc_name' and is usually due to 'struct _RNID' being packed, which can lead to unaligned accesses [-Werror,-Wunaligned-access] drivers/scsi/lpfc/lpfc_hw.h:923:19: error: field nodeName within 'struct _RNID' is less aligned than 'struct lpfc_name' and is usually due to 'struct _RNID' being packed, which can lead to unaligned accesses [-Werror,-Wunaligned-access] From the git history, I can see that all the __packed annotations were done specifically to avoid introducing implicit padding around the lpfc_name instances, though this was probably the wrong approach. To improve this, only annotate the one uint64_t field inside of lpfc_name as packed, with an explicit 4-byte alignment, as is the default already on the 32-bit x86 ABI but not on most others. With this, the other __packed annotations can be removed again, as this avoids the incorrect padding. Two other structures change their layout as a result of this change: - struct _LOGO never gained a __packed annotation even though it has the same alignment problem as the others but is not used anywhere in the driver today. - struct serv_param similarly has this issue, and it is used, my guess is that this is only an internal structure rather than part of a binary interface, so the padding has no negative effect here. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20230616090705.2623408-1-arnd@kernel.org Reviewed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds2023-06-3017-613/+566
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull SCSI updates from James Bottomley: "Updates to the usual drivers (ufs, pm80xx, libata-scsi, smartpqi, lpfc, qla2xxx). We have a couple of major core changes impacting other systems: - Command Duration Limits, which spills into block and ATA - block level Persistent Reservation Operations, which touches block, nvme, target and dm Both of these are added with merge commits containing a cover letter explaining what's going on" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (187 commits) scsi: core: Improve warning message in scsi_device_block() scsi: core: Replace scsi_target_block() with scsi_block_targets() scsi: core: Don't wait for quiesce in scsi_device_block() scsi: core: Don't wait for quiesce in scsi_stop_queue() scsi: core: Merge scsi_internal_device_block() and device_block() scsi: sg: Increase number of devices scsi: bsg: Increase number of devices scsi: qla2xxx: Remove unused nvme_ls_waitq wait queue scsi: ufs: ufs-pci: Add support for Intel Arrow Lake scsi: sd: sd_zbc: Use PAGE_SECTORS_SHIFT scsi: ufs: wb: Add explicit flush_threshold sysfs attribute scsi: ufs: ufs-qcom: Switch to the new ICE API scsi: ufs: dt-bindings: qcom: Add ICE phandle scsi: ufs: ufs-mediatek: Set UFSHCD_QUIRK_MCQ_BROKEN_RTC quirk scsi: ufs: ufs-mediatek: Set UFSHCD_QUIRK_MCQ_BROKEN_INTR quirk scsi: ufs: core: Add host quirk UFSHCD_QUIRK_MCQ_BROKEN_RTC scsi: ufs: core: Add host quirk UFSHCD_QUIRK_MCQ_BROKEN_INTR scsi: ufs: core: Remove dedicated hwq for dev command scsi: ufs: core: mcq: Fix the incorrect OCS value for the device command scsi: ufs: dt-bindings: samsung,exynos: Drop unneeded quotes ...
| * scsi: lpfc: Avoid -Wstringop-overflow warningGustavo A. R. Silva2023-06-071-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prevent any potential integer wrapping issue, and avoid a -Wstringop-overflow warning by using the check_mul_overflow() helper. drivers/scsi/lpfc/lpfc.h: 837:#define LPFC_RAS_MIN_BUFF_POST_SIZE (256 * 1024) drivers/scsi/lpfc/lpfc_debugfs.c: 2266 size = LPFC_RAS_MIN_BUFF_POST_SIZE * phba->cfg_ras_fwlog_buffsize; this can wrap to negative if cfg_ras_fwlog_buffsize is large enough. And even when in practice this is not possible (due to phba->cfg_ras_fwlog_buffsize never being larger than 4[1]), the compiler is legitimately warning us about potentially buggy code. Fix the following warning seen under GCC-13: In function ‘lpfc_debugfs_ras_log_data’, inlined from ‘lpfc_debugfs_ras_log_open’ at drivers/scsi/lpfc/lpfc_debugfs.c:2271:15: drivers/scsi/lpfc/lpfc_debugfs.c:2210:25: warning: ‘memcpy’ specified bound between 18446744071562067968 and 18446744073709551615 exceeds maximum object size 9223372036854775807 [-Wstringop-overflow=] 2210 | memcpy(buffer + copied, dmabuf->virt, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2211 | size - copied - 1); | ~~~~~~~~~~~~~~~~~~ Link: https://github.com/KSPP/linux/issues/305 Link: https://lore.kernel.org/linux-hardening/CABPRKS8zyzrbsWt4B5fp7kMowAZFiMLKg5kW26uELpg1cDKY3A@mail.gmail.com/ [1] Co-developed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/ZHkseX6TiFahvxJA@work Reviewed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: lpfc: Use struct_size() helperJustin Tee2023-06-071-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prefer struct_size() over open-coded versions of idiom: sizeof(struct-with-flex-array) + sizeof(typeof-flex-array-elements) * count where count is the max number of items the flexible array is supposed to contain. Link: https://github.com/KSPP/linux/issues/160 Co-developed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Co-developed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230531223319.24328-1-justintee8345@gmail.com Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: lpfc: Fix incorrect big endian type assignments in FDMI and VMID pathsJustin Tee2023-05-312-52/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The kernel test robot reported sparse warnings regarding the improper usage of beXX_to_cpu() macros. Change the flagged FDMI and VMID member variables to __beXX and redo the beXX_to_cpu() macros appropriately. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230530191405.21580-1-justintee8345@gmail.com Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202305261159.lTW5NYrv-lkp@intel.com/ Closes: https://lore.kernel.org/oe-kbuild-all/202305260751.NWFvhLY5-lkp@intel.com/ Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * Merge patch series "lpfc: Update lpfc to revision 14.2.0.13"Martin K. Petersen2023-05-3111-320/+195
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Justin Tee <justintee8345@gmail.com> says: Update lpfc to revision 14.2.0.13 This patch set contains discovery bug fixes, firmware logging improvements, clean up of CQ handling, and statistics collection enhancements. The patches were cut against Martin's 6.5/scsi-queue tree. Link: https://lore.kernel.org/r/20230523183206.7728-1-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| | * scsi: lpfc: Copyright updates for 14.2.0.13 patchesJustin Tee2023-05-312-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | Update copyrights to 2023 for files modified in the 14.2.0.13 patch set. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230523183206.7728-10-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| | * scsi: lpfc: Update lpfc version to 14.2.0.13Justin Tee2023-05-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Update lpfc version to 14.2.0.13 Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230523183206.7728-9-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| | * scsi: lpfc: Enhance congestion statistics collectionJustin Tee2023-05-312-208/+81
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Various improvements are made for collecting congestion statistics: - Pre-existing logic is replaced with use of an hrtimer for increased reporting accuracy. - Congestion timestamp information is reorganized into a single struct. - Common statistic collection logic is refactored into a helper routine. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230523183206.7728-8-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| | * scsi: lpfc: Clean up SLI-4 CQE status handlingJustin Tee2023-05-315-50/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is mishandling of SLI-4 CQE status values larger than what is allowed by the LPFC_IOCB_STATUS_MASK of 4 bits. The LPFC_IOCB_STATUS_MASK is a leftover SLI-3 construct and serves no purpose in SLI-4 path. Remove the LPFC_IOCB_STATUS_MASK and clean up general CQE status handling in SLI-4 completion paths. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230523183206.7728-7-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| | * scsi: lpfc: Change firmware upgrade logging to KERN_NOTICE instead of ↵Justin Tee2023-05-313-41/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | TRACE_EVENT A firmware upgrade does not necessitate dumping of phba->dbg_log[] to kmsg via LOG_TRACE_EVENT. A simple KERN_NOTICE log message should suffice to notify the user of successful or unsuccessful firmware upgrade. As such, firmware upgrade log messages are updated to use KERN_NOTICE instead of LOG_TRACE_EVENT. Additionally, in order to notify the user of reset type for instantiating newly downloaded firmware, lpfc_log_msg's default KERN_LEVEL is updated to 5 or KERN_NOTICE. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230523183206.7728-6-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| | * scsi: lpfc: Revise NPIV ELS unsol rcv cmpl logic to drop ndlp based on nlp_stateJustin Tee2023-05-311-2/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When NPIV ports are zoned to devices that support both initiator and target mode, a remote device's initiated PRLI results in unintended final kref clean up of the device's ndlp structure. This disrupts NPIV ports' discovery for target devices that support both initiator and target mode. Modify the NPIV lpfc_drop_node clause such that we allow the ndlp to live so long as it was in NLP_STE_PLOGI_ISSUE, NLP_STE_REG_LOGIN_ISSUE, or NLP_STE_PRLI_ISSUE nlp_state. This allows lpfc's issued PRLI completion routine to determine if the final kref clean up should execute rather than a remote device's issued PRLI. Fixes: db651ec22524 ("scsi: lpfc: Correct used_rpi count when devloss tmo fires with no recovery") Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230523183206.7728-5-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| | * scsi: lpfc: Account for fabric domain ctlr device loss recoveryJustin Tee2023-05-311-5/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pre-existing device loss recovery logic via the NLP_IN_RECOV_POST_DEV_LOSS flag only handled Fabric Port Login, Fabric Controller, Management, and Name Server addresses. Fabric domain controllers fall under the same category for usage of the NLP_IN_RECOV_POST_DEV_LOSS flag. Add a default case statement to mark an ndlp for device loss recovery. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230523183206.7728-4-justintee8345@gmail.com Acked-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| | * scsi: lpfc: Clear NLP_IN_DEV_LOSS flag if already in rediscoveryJustin Tee2023-05-311-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In dev_loss_tmo callback routine, we early return if the ndlp is in a state of rediscovery. This occurs when a target proactively PLOGIs or PRLIs after an RSCN before the dev_loss_tmo callback routine is scheduled to run. Move clear of the NLP_IN_DEV_LOSS flag before the ndlp state check in such cases. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230523183206.7728-3-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| | * scsi: lpfc: Fix use-after-free rport memory access in ↵Justin Tee2023-05-311-8/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | lpfc_register_remote_port() Due to a target port D_ID swap, it is possible for the lpfc_register_remote_port() routine to touch post mortem fc_rport memory when trying to access fc_rport->dd_data. The D_ID swap causes a simultaneous call to lpfc_unregister_remote_port(), where fc_remote_port_delete() reclaims fc_rport memory. Remove the fc_rport->dd_data->pnode NULL assignment because the following line reassigns ndlp->rport with an fc_rport object from fc_remote_port_add() anyways. The pnode nullification is superfluous. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230523183206.7728-2-justintee8345@gmail.com Acked-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | scsi: lpfc: Replace all non-returning strlcpy() with strscpy()Azeem Shaikh2023-05-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | strlcpy() reads the entire source buffer first. This read may exceed the destination size limit. This is both inefficient and can lead to linear read overflows if a source string is not NUL-terminated [1]. In an effort to remove strlcpy() completely [2], replace strlcpy() here with strscpy(). No return values were used, so direct replacement is safe. [1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy [2] https://github.com/KSPP/linux/issues/89 Signed-off-by: Azeem Shaikh <azeemshaikh38@gmail.com> Link: https://lore.kernel.org/r/20230530155745.343032-1-azeemshaikh38@gmail.com Reviewed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>