summaryrefslogtreecommitdiffstats
path: root/drivers/scsi/megaraid/megaraid_sas_fusion.h
Commit message (Collapse)AuthorAgeFilesLines
* scsi: megaraid_sas: IRQ poll to avoid CPU hard lockupsShivasharan S2019-06-181-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Issue Description: We have seen cpu lock up issues from field if system has a large (more than 96) logical cpu count. SAS3.0 controller (Invader series) supports max 96 MSI-X vector and SAS3.5 product (Ventura) supports max 128 MSI-X vectors. This may be a generic issue (if PCI device support completion on multiple reply queues). Let me explain it w.r.t megaraid_sas supported h/w just to simplify the problem and possible changes to handle such issues. MegaRAID controller supports multiple reply queues in completion path. Driver creates MSI-X vectors for controller as "minimum of (FW supported Reply queues, Logical CPUs)". If submitter is not interrupted via completion on same CPU, there is a loop in the IO path. This behavior can cause hard/soft CPU lockups, IO timeout, system sluggish etc. Example - one CPU (e.g. CPU A) is busy submitting the IOs and another CPU (e.g. CPU B) is busy with processing the corresponding IO's reply descriptors from reply descriptor queue upon receiving the interrupts from HBA. If CPU A is continuously pumping the IOs then always CPU B (which is executing the ISR) will see the valid reply descriptors in the reply descriptor queue and it will be continuously processing those reply descriptor in a loop without quitting the ISR handler. megaraid_sas driver will exit ISR handler if it finds unused reply descriptor in the reply descriptor queue. Since CPU A will be continuously sending the IOs, CPU B may always see a valid reply descriptor (posted by HBA Firmware after processing the IO) in the reply descriptor queue. In worst case, driver will not quit from this loop in the ISR handler. Eventually, CPU lockup will be detected by watchdog. Above mentioned behavior is not common if "rq_affinity" set to 2 or affinity_hint is honored by irqbalancer as "exact". If rq_affinity is set to 2, submitter will be always interrupted via completion on same CPU. If irqbalancer is using "exact" policy, interrupt will be delivered to submitter CPU. Problem statement: If CPU count to MSI-X vectors (reply descriptor Queues) count ratio is not 1:1, we still have exposure of issue explained above and for that we don't have any solution. Exposure of soft/hard lockup is seen if CPU count is more than MSI-X supported by device. If CPUs count to MSI-X vectors count ratio is not 1:1, (Other way, if CPU counts to MSI-X vector count ratio is something like X:1, where X > 1) then 'exact' irqbalance policy OR rq_affinity = 2 won't help to avoid CPU hard/soft lockups. There won't be any one to one mapping between CPU to MSI-X vector instead one MSI-X interrupt (or reply descriptor queue) is shared with group/set of CPUs and there is a possibility of having a loop in the IO path within that CPU group and may observe lockups. For example: Consider a system having two NUMA nodes and each node having four logical CPUs and also consider that number of MSI-X vectors enabled on the HBA is two, then CPUs count to MSI-X vector count ratio as 4:1. e.g. MSI-X vector 0 is affinity to CPU 0, CPU 1, CPU 2 & CPU 3 of NUMA node 0 and MSI-X vector 1 is affinity to CPU 4, CPU 5, CPU 6 & CPU 7 of NUMA node 1. numactl --hardware available: 2 nodes (0-1) node 0 cpus: 0 1 2 3 --> MSI-X 0 node 0 size: 65536 MB node 0 free: 63176 MB node 1 cpus: 4 5 6 7 --> MSI-X 1 node 1 size: 65536 MB node 1 free: 63176 MB Assume that user started an application which uses all the CPUs of NUMA node 0 for issuing the IOs. Only one CPU from affinity list (it can be any cpu since this behavior depends upon irqbalance) CPU0 will receive the interrupts from MSI-X 0 for all the IOs. Eventually, CPU 0 IO submission percentage will be decreasing and ISR processing percentage will be increasing as it is more busy with processing the interrupts. Gradually IO submission percentage on CPU 0 will be zero and it's ISR processing percentage will be 100% as IO loop has already formed within the NUMA node 0, i.e. CPU 1, CPU 2 & CPU 3 will be continuously busy with submitting the heavy IOs and only CPU 0 is busy in the ISR path as it always find the valid reply descriptor in the reply descriptor queue. Eventually, we will observe the hard lockup here. Chances of occurring of hard/soft lockups are directly proportional to value of X. If value of X is high, then chances of observing CPU lockups is high. Solution: Use IRQ poll interface defined in "irq_poll.c". megaraid_sas driver will execute ISR routine in softirq context and it will always quit the loop based on budget provided in IRQ poll interface. Driver will switch to IRQ poll only when more than a threshold number of reply descriptors are handled in one ISR. Currently threshold is set as 1/4th of HBA queue depth. In these scenarios (i.e. where CPUs count to MSI-X vectors count ratio is X:1 (where X > 1)), IRQ poll interface will avoid CPU hard lockups due to voluntary exit from the reply queue processing based on budget. Note - Only one MSI-X vector is busy doing processing. Select CONFIG_IRQ_POLL from driver Kconfig for driver compilation. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: megaraid_sas: Add support for DEVICE_LIST DCMD in driverShivasharan S2019-02-041-0/+1
| | | | | | | | | | | | | | | | This patch adds support for the new DEVICE_LIST DCMD. Driver currently sends two separate DCMDs for getting the list of PDs and LDs that are exposed to host. The new DCMD provides a single interface to get a list of both PDs and LDs that are exposed to the host. Based on the list of target IDs that are returned by this DCMD, driver will add the devices (PD/LD) to SML. Driver will check for FW support for this new DCMD and based on the support will either send the new DCMD or will fall back to the earlier method of sending two separate DCMDs for PD and LD list. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: megaraid_sas: Update copyright informationShivasharan S2018-11-061-8/+6
| | | | | | | | Change copyright to Broadcom Inc. Also update any references to Avago with Broadcom. Update copyright duration wherever required. Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: megaraid_sas: Add support for FW snap dumpShivasharan S2018-11-061-0/+12
| | | | | | | | | | | | | Latest firmware adds a mechanism to save firmware logs just before controller reset on pre-allocated internal controller DRAM. This feature is called snapdump which will help debugging firmware issues. This feature requires extra time and firmware reports these values through new driver interface. Before initiating an OCR, driver needs to inform FW to save a snapdump and then wait for a specified time for the snapdump to complete. Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: megaraid_sas: re-work DCMD refire codeShivasharan S2018-01-101-0/+6
| | | | | | | | | | | | No functional changes. This patch is a re-work of DCMD refire code to better manage all the different cases to decide whether to REFIRE or SKIP or COMPLETE certain DCMD. Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: megaraid_sas: Add support for 64bit consistent DMAShivasharan S2017-10-251-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The latest MegaRAID Firmware (for Invader series) has support for 64bit DMA for both streaming and consistent DMA buffers. All Ventura series controller FW always support 64 bit consistent DMA. Also, on a few architectures 32bit DMA is not supported. Current driver always prefers 32bit for consistent DMA and 64bit for streaming DMA. This behavior was unintentional and carried forwarded from legacy controller FW. Need to enhance the driver to support 64bit consistent DMA buffers based on the firmware capability. Below is the DMA setting strategy in driver with this patch. For Ventura series, always try to set 64bit DMA mask. If it fails fall back to 32bit DMA mask. For Invader series and earlier generation controllers, first try to set to 32bit consistent DMA mask irrespective of FW capability. This is needed to ensure firmware downgrades do not break. If 32bit DMA setting fails, check FW capability and try seting to 64bit DMA mask. There are certain restrictions in the hardware for having all sense buffers and all reply descriptors to be in the same 4GB memory region. This limitation is h/w dependent and can not be changed in firmware. This limitation needs to be taken care in driver while allocating the buffers. There was a discussion regarding this - find details at below link. https://www.spinics.net/lists/linux-scsi/msg108251.html Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: megaraid_sas: Retry with reduced queue depth when alloc fails for ↵Shivasharan S2017-10-251-0/+1
| | | | | | | | | | | | | | | higher QD In certain cases, the host memory is limited and with FW supporting higher queue depths there are increasing chances of IO request frame allocation failures that we are seeing. In case of request frame allocation failures, retry allocation with reduced queue depth (in steps of 64) to continue to configure the controller with a reduced performance rather than failing load. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: megaraid_sas: Resize MFA frame used for IOC INIT to 4kShivasharan S2017-10-251-0/+2
| | | | | | | | | | | | | Older firmware version unconditionally pulls 4k frame for IOC INIT MFA frame. But driver allocates 1k or 4k max_chain_frame_sz based on FW capability. During boot time, this results in DMA read errors. Workaround fix in driver by allocating separate ioc_init frame of 4k size to support older firmware. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Cc: stable@vger.kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: megaraid_sas: Pre-allocate frequently used DMA buffersShivasharan S2017-10-251-0/+3
| | | | | | | | Pre-allocate few of the frequently used DMA buffers during load time. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: megaraid_sas: reduce size of fusion_context and use kmalloc for allocationShivasharan S2017-10-251-1/+2
| | | | | | | | | | | | | | | | | fusion_context structure is very large around 180kB and most of the size is contributed by log_to_span array. Move log_to_span out of fusion context and have separate allocation for log_to_span. And use kmalloc to allocate fusion_context. Currently kmemleak reports 1000s of false positives for fusion->cmd_list[]. kmemleak does not track page allocation for fusion_context. This change will also fix the false positives reported by kmemleak. Ref: https://marc.info/?l=linux-scsi&m=150545293900917 Reported-by: Shu Wang <shuwang@redhat.com> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: megaraid_sas: use adapter_type for all gen controllersShivasharan S2017-10-251-7/+0
| | | | | | | | | | No functional change. Refactor adapter_type to set for all generation controllers, not just for fusion controllers. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: megaraid_sas: Change RAID_1_10_RMW_CMDS to RAID_1_PEER_CMDS and set ↵Shivasharan S2017-02-131-1/+1
| | | | | | | | | | | | | | | | value to 2 For RAID1 FastPath writes, driver needs to allocate extra commands internally to accommodate for the extra peer command being sent. Currently driver is allocating 2 extra commands for each but only one extra command is necessary. Set RAID_1_10_RMW_CMDS to 2 and also change macro name to RAID_1_PEER_CMDS. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: megaraid_sas: Indentation and smatch warning fixesShivasharan S2017-02-131-1/+1
| | | | | | | | | | | | | | Fix indentation issues and smatch warning reported by Dan Carpenter for previous series as discussed below. http://www.spinics.net/lists/linux-scsi/msg103635.html http://www.spinics.net/lists/linux-scsi/msg103603.html Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Sasikumar Chandrasekaran <sasikumar.pc@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: megaraid_sas: big endian support changesShivasharan S2017-02-131-43/+72
| | | | | | | | | | Fix endiannes fixes for Ventura specific. Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: megaraid_sas: reduce size of fusion_context and use vmalloc if kmalloc ↵Shivasharan S2017-02-131-1/+2
| | | | | | | | | | | | | | | fails Currently fusion context has fixed array load_balance_info. Use dynamic allocation. In few places, driver do not want physically contigious memory. Attempt to use vmalloc if physical contiguous memory is not available. Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: megaraid_sas: NVME fast path io supportShivasharan S2017-02-131-1/+13
| | | | | | | | | | | | This patch provide true fast path IO support. Driver creates PRP for NVME drives and send Fast Path for performance. Certain h/w requirement needs to be taken care in driver. Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: megaraid_sas: NVME interface target prop addedShivasharan S2017-02-131-0/+1
| | | | | | | | | | | This patch fetch true values of NVME property from FW using New DCMD interface MR_DCMD_DEV_GET_TARGET_PROP Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: megaraid_sas: NVME Interface detection and prop settingsShivasharan S2017-02-131-1/+1
| | | | | | | | | | | | | | Adding detection logic for NVME device attached behind Ventura controller. Driver set HostPageSize in IOC_INIT frame to inform about page size for NVME devices. Firmware reports NVME page size to the driver. PD INFO DCMD provide new interface type NVME_PD. Driver set property of NVME device. Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: megaraid_sas: raid 1 fast path code optimizeShivasharan S2017-02-131-2/+1
| | | | | | | | | | | | | | | | | No functional change. Code refactor. Remove function megasas_fpio_to_ldio as we never require to convert fpio to ldio because of frame unavailability. Grab extra frame of raid 1 write fast path before it creates first frame as Fast Path. Removed is_raid_1_fp_write flag as raid 1 write fast path command is decided using r1_alt_dev_handle only. Move resetting megasas_cmd_fusion fields at common function megasas_return_cmd_fusion. Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* Revert "scsi: megaraid_sas: Enable or Disable Fast path based on the PCI ↵Shivasharan S2017-02-131-1/+1
| | | | | | | | | | | | | | | | | | Threshold Bandwidth" This reverts commit "3e5eadb1a881" ("scsi: megaraid_sas: Enable or Disable Fast path based on the PCI Threshold Bandwidth") This patch was aimed to increase performance of R1 Write operation for large IO size. Since this method used timer approach, it turn on/off fast path did not work as expected. Patch 0013 describes new algorithm and performance number. Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: megaraid_sas: Implement the PD Map support for SAS3.5 Generic Megaraid ↵Sasikumar Chandrasekaran2017-01-101-1/+2
| | | | | | | | | | Controllers Update Linux driver to use new pdTargetId field for JBOD target ID Signed-off-by: Sasikumar Chandrasekaran <sasikumar.pc@broadcom.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: megaraid_sas: Enable or Disable Fast path based on the PCI Threshold ↵Sasikumar Chandrasekaran2017-01-101-1/+1
| | | | | | | | | | Bandwidth Large SEQ IO workload should sent as non fast path commands Signed-off-by: Sasikumar Chandrasekaran <sasikumar.pc@broadcom.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: megaraid_sas: Add the Support for SAS3.5 Generic Megaraid Controllers ↵Sasikumar Chandrasekaran2017-01-101-0/+1
| | | | | | | | | | Capabilities The Megaraid driver has to support the SAS3.5 Generic Megaraid Controllers Firmware functionality. Signed-off-by: Sasikumar Chandrasekaran <sasikumar.pc@broadcom.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: megaraid_sas: Dynamic Raid Map Changes for SAS3.5 Generic Megaraid ↵Sasikumar Chandrasekaran2017-01-101-34/+206
| | | | | | | | | | | Controllers SAS3.5 Generic Megaraid Controllers FW will support new dynamic RaidMap to have different sizes for different number of supported VDs. Signed-off-by: Sasikumar Chandrasekaran <sasikumar.pc@broadcom.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: megaraid_sas: SAS3.5 Generic Megaraid Controllers Fast Path for RAID ↵Sasikumar Chandrasekaran2017-01-101-2/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1/10 Writes To improve RAID 1/10 Write performance, OS drivers need to issue the required Write IOs as Fast Path IOs (after the appropriate checks allowing Fast Path to be used) to the appropriate physical drives (translated from the OS logical IO) and wait for all Write IOs to complete. Design: A write IO on RAID volume will be examined if it can be sent in Fast Path based on IO size and starting LBA and ending LBA falling on to a Physical Drive boundary. If the underlying RAID volume is a RAID 1/10, driver issues two fast path write IOs one for each corresponding physical drive after computing the corresponding start LBA for each physical drive. Both write IOs will have the same payload and are posted to HW such that replies land in the same reply queue. If there are no resources available for sending two IOs, driver will send the original IO from SCSI layer to RAID volume through the Firmware. Based on PCI bandwidth and write payload, every second this feature is enabled/disabled. When both IOs are completed by HW, the resources will be released and SCSI IO completion handler will be called. Signed-off-by: Sasikumar Chandrasekaran <sasikumar.pc@broadcom.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: megaraid_sas: SAS3.5 Generic Megaraid Controllers Stream Detection and ↵Sasikumar Chandrasekaran2017-01-101-3/+114
| | | | | | | | | | | | | | | | | | | | IO Coalescing Detect sequential Write IOs and pass the hint that it is part of sequential stream to help HBA Firmware do the Full Stripe Writes. For read IOs on certain RAID volumes like Read Ahead volumes,this will help driver to send it to Firmware even if the IOs can potentially be sent to hardware directly (called fast path) bypassing firmware. Design: 8 streams are maintained per RAID volume as per the combined firmware/driver design. When there is no stream detected the LRU stream is used for next potential stream and LRU/MRU map is updated to make this as MRU stream. Every time a stream is detected the MRU map is updated to make the current stream as MRU stream. Signed-off-by: Sasikumar Chandrasekaran <sasikumar.pc@broadcom.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: megaraid_sas: EEDP Escape Mode Support for SAS3.5 Generic Megaraid ↵Sasikumar Chandrasekaran2017-01-101-0/+2
| | | | | | | | | | | | | | | | | Controllers An UNMAP command on a PI formatted device will leave the Logical Block Application Tag and Logical Block Reference Tag as all F's (for those LBAs that are unmapped). To avoid IO errors if those LBAs are subsequently read before they are written with valid tag fields, the MPI SCSI IO requests need to set the EEDPFlags element EEDP Escape Mode field, Bits [7:6] appropriately. A value of 2 should be set to disable all PI checks if the Logical Block Application Tag is 0xFFFF for PI types 1 and 2. A value of 3 should be set to disable all PI checks if the Logical Block Application Tag is 0xFFFF and the Logical Block Reference Tag is 0xFFFFFFFF for PI type 3. Signed-off-by: Sasikumar Chandrasekaran <sasikumar.pc@broadcom.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: megaraid_sas: clean function declarations in megaraid_sas_base.c upBaoyou Xie2016-09-191-0/+9
| | | | | | | | | | | | | | | | | | We get a few warnings when building kernel with W=1: drivers/scsi/megaraid/megaraid_sas_fusion.c:281:1: warning: no previous prototype for 'megasas_free_cmds_fusion' [-Wmissing-prototypes] drivers/scsi/megaraid/megaraid_sas_fusion.c:714:1: warning: no previous prototype for 'megasas_ioc_init_fusion' [-Wmissing-prototypes] .... In fact, these functions are declared in drivers/scsi/megaraid/megaraid_sas_base.c, but should be declared in a header file, thus can be recognized in other file. So this patch adds the declarations into drivers/scsi/megaraid/megaraid_sas_fusion.h. Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org> Acked-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* megaraid_sas: Reply Descriptor Post Queue (RDPQ) supportSumit Saxena2016-02-231-2/+10
| | | | | | | | | | | | | | | | | | | | This patch will create a reply queue pool for each MSI-X index and will provide an array of base addresses instead of the single address of legacy mode. Using this new interface the driver can support higher queue depths through scattered DMA pools. If array mode is not supported driver will fall back to the legacy method of reply pool allocation. This limits controller queue depth to 1K max. To enable a queue depth of more than 1K driver requires firmware to support array mode and scratch_pad3 will provide the new queue depth value. When RDPQ is used, downgrading to an older firmware release should not be permitted. This may cause firmware fault and is not supported. Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* megaraid_sas: Fastpath region lock bypassSumit Saxena2016-02-231-3/+5
| | | | | | | | | | | | Firmware will fill out per-LD data to tell driver whether a particular LD supports region lock bypass. If yes, then driver will send non-FP LDIO to region lock bypass FIFO. With this change in driver, firmware will optimize certain code to improve performance. Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* megaraid_sas: Task management supportSumit Saxena2016-02-231-5/+112
| | | | | | | | | | | | | | | | | | | | | | | | This patch adds task management for SCSI commands. Added functions are task abort and target reset. 1. Currently, megaraid_sas driver performs controller reset when any IO times out. With task management support added, task abort and target reset will be tried to recover timed out IO. If task management fails, then controller reset will be performaned. If the task management request times out, fail the request and escalate to the next level (controller reset). 2. mr_device_priv_data will be allocated for all generations of controller, but is_tm_capable flag will never be set for controllers (prior to Invader series) as firmware support is not available for task management. 3. Task management capable firmware will set is_tm_capable flag in firmware API. Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* megaraid_sas: Syncing request flags macro names with firmwareSumit Saxena2016-02-231-1/+2
| | | | | | Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* megaraid_sas: Remove PCI id checkssumit.saxena@avagotech.com2015-10-291-0/+6
| | | | | | | | | | | | | Remove PCI id based checks and use instance->ctrl_context to decide whether controller is MFI-based or a Fusion adapter. Additionally, Fusion adapters are divided into two categories: Thunderbolt and Invader. Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* megaraid_sas: Support for max_io_size 1MBsumit.saxena@avagotech.com2015-10-291-2/+8
| | | | | | | | | | Driver will expose max sge = 256 (earlier it was 64) if firmware supports extended IO size (1M). Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Reviewed-by: Martin Petersen <martin.petersen@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* megaraid_sas: JBOD sequence number supportsumit.saxena@avagotech.com2015-10-291-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implemented JBOD map which will provide quick access for JBOD path and also provide sequence number. This will help hardware to fail command to the FW in case of any sequence mismatch. Fast Path I/O for JBOD will refer JBOD map (which has sequence number per JBOD device) instead of RAID map. Previously, the driver used RAID map to get device handle for fast path I/O and this not have sequence number information. Now, driver will use JBOD map instead. As part of error handling, if JBOD map is failed/not supported by firmware, driver will continue using legacy behavior. Now there will be three IO paths for JBOD (syspd): - JBOD map with sequence number (Fast Path) - RAID map without sequence number (Fast Path) - FW path via h/w exception queue deliberately setup devhandle 0xFFFF (FW path). Relevant data structures: - Driver send new DCMD MR_DCMD_SYSTEM_PD_MAP_GET_INFO for this purpose. - struct MR_PD_CFG_SEQ- This structure represent map of single physical device. - struct MR_PD_CFG_SEQ_NUM_SYNC- This structure represent whole JBOD map in general(size, count of sysPDs configured, struct MR_PD_CFG_SEQ of syspD with 0 index). - JBOD sequence map size is: sizeof(struct MR_PD_CFG_SEQ_NUM_SYNC) + (sizeof(struct MR_PD_CFG_SEQ) * (MAX_PHYSICAL_DEVICES - 1)) which is allocated while setting up JBOD map at driver load time. Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Reviewed-by: Martin Petersen <martin.petersen@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* megaraid_sas: Synchronize driver headers with firmware APIssumit.saxena@avagotech.com2015-10-291-1/+2
| | | | | | | Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Reviewed-by: Martin Petersen <martin.petersen@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* megaraid_sas : add endianness annotationsChristoph Hellwig2015-05-251-138/+138
| | | | | | | | | | | This adds endianness annotations to all data structures, and a few variables directly referencing them. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: James Bottomley <JBottomley@Odin.com>
* megaraid_sas : Use Block layer tag support for internal command indexingSumit.Saxena@avagotech.com2015-05-251-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | megaraid_sas driver will use block layer provided tag for indexing internal MPT frames to get any unique MPT frame tied with tag. Each IO request submitted from SCSI mid layer will get associated MPT frame from MPT framepool (retrieved and return back using spinlock inside megaraid_sas driver's submission/completion call back). Getting MPT frame from MPT Frame pool is very expensive operation because of associated spin lock operation (spinlock overhead increase on multi NUMA node). This type of locking in driver is very expensive call considering each IO request need - Acquire and Release of the same lock. With this support, in IO path driver will directly provide the unique command index(which is based on block layer tag) and will get the MPT frame tied to the tag and this way driver can get rid off lock, which synchronizes the access to MPT frame pool while fetching and returning MPT frame from the pool. This support in driver provides siginificant performance improvement(on multi NUMA node system)on latest upstream with SCSI.MQ as well as on existing linux distributions. Here is the data for test executed at Avago- - IO Tool- FIO - 4 Socket SMC server. (4 NUMA node server) - 12 SSDs in JBOD mode . - 4K Rand READ, QD=32 - SCSI MQ x86_64 (Latest Upstream kernel) - upto 300% Performance Improvement. If IOs are running on single Node, perfromance gain is less, but as soon as increase number of nodes, performance improvement is significant. IOs running on all 4 NUMA nodes, with this patch applied IOPs observed was 1170K vs 344K IOPs seen without this patch. Logically, there are two parts of this patch- 1) Block layer tag support 2) changes in calling convention of return_cmd. part 2 will revert the changes done by patch- 90dc9d9 megaraid_sas : MFI MPT linked list corruption fix because changes done in part 1 has fixed the problem of MFI MPT linked list corruption. part 2 is very much dependent on part 1, so we decided to have single patch for these two logical changes. [jejb: remove chatty printk pointed out by hch] Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: James Bottomley <JBottomley@Odin.com>
* megaraid_sas: endianness related bug fixes and code optimizationSumit.Saxena@avagotech.com2015-01-091-7/+2
| | | | | | | | | | | | | | | | | | | This patch addresses below issues: 1) Few endianness bug fixes. 2) Break the iteration after (MAX_LOGICAL_DRIVES_EXT - 1)), instead of MAX_LOGICAL_DRIVES_EXT. 3) Optimization in MFI INIT frame before firing. 4) MFI IO frame should be 256bytes aligned. Code is optimized to reduce the size of frame for fusion adapters and make the MFI frame size calculation a bit transparent and readable. Cc: <stable@vger.kernel.org> Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: Chaitra Basappa <chaitra.basappa@avagotech.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
* megaraid_sas: online Firmware upgrade support for Extended VD featureSumit.Saxena@avagotech.com2014-11-241-2/+0
| | | | | | | | | | | | | In OCR (Online Controller Reset) path, driver sets adapter state to MEGASAS_HBA_OPERATIONAL before getting new RAID map. There will be a small window where IO will come from OS with old RAID map. This patch will update adapter state to MEGASAS_HBA_OPERATIONAL, only after driver has new RAID map to avoid any IOs getting build using old RAID map. Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
* megaraid_sas: update MAINTAINERS and copyright information for megaraid driversSumit.Saxena@avagotech.com2014-11-241-7/+9
| | | | | | | | Update MAINTAINERS list and copyright information for megaraid_sas driver. Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
* megaraid_sas : MFI MPT linked list corruption fixSumit.Saxena@avagotech.com2014-09-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Resending the patch. Addressed the review comments from Tomas Henzl. Added comment for to-do work. Problem statement: MFI link list in megaraid_sas driver is used from mfi-mpt pass-through commands. This list can be corrupted due to many possible race conditions in driver and eventually we may see kernel panic. One example - MFI frame is freed from calling process as driver send command via polling method and interrupt for that command comes after driver free mfi frame (actually even after some other context reuse the mfi frame). When driver receive MPT frame in ISR, driver will be using the index of MFI and access that MFI frame and finally in-used MFI frame’s list will be corrupted. High level description of new solution - Free MFI and MPT command from same context. Free both the command either from process (from where mfi-mpt pass-through was called) or from ISR context. Do not split freeing of MFI and MPT, because it creates the race condition which will do MFI/MPT list corruption. Renamed the cmd_pool_lock which is used in instance as well as fusion with below name. mfi_pool_lock and mpt_pool_lock to add more code readability. Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
* megaraid_sas : N-drive primary raid level 1 load balancingSumit.Saxena@avagotech.com2014-09-161-3/+5
| | | | | | | | | | | | | | Resending the patch. Addressed the review comments from Tomas Henzl. Current driver does fast path read load balancing between arm and mirror disk for two Drive Raid-1 configuration only. Now, Driver support fast path read load balancing for all (any number of disk) Raid-1 configuration. Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
* megaraid_sas : Extended VD supportSumit.Saxena@avagotech.com2014-09-161-4/+90
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Resending the patch. Addressed the review comments from Tomas Henzl. reserved1 field(part of union) of Raid map struct was not required so it is removed. Current MegaRAID firmware and hence the driver only supported 64VDs. E.g: If the user wants to create more than 64VD on a controller, it is not possible on current firmware/driver. New feature and requirement to support upto 256VD, firmware/driver/apps need changes. In addition to that there must be a backward compatibility of the new driver with the older firmware and vice versa. RAID map is the interface between Driver and FW to fetch all required fields(attributes) for each Virtual Drives. In the earlier design driver was using the FW copy of RAID map where as in the new design the Driver will keep the RAID map copy of its own; on which it will operate for any raid map access in fast path. Local driver raid map copy will provide ease of access through out the code and provide generic interface for future FW raid map changes. For the backward compatibility driver will notify FW that it supports 256VD to the FW in driver capability field. Based on the controller properly returned by the FW, the Driver will know whether it supports 256VD or not and will copy the RAID map accordingly. At any given time, driver will always have old or new Raid map. So with this changes, driver can also work in host lock less mode. Please see next patch which enable host lock less mode for megaraid_sas driver. Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
* megaraid_sas : Update threshold based reply post host index registerSumit.Saxena@avagotech.com2014-09-161-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Resending the patch. Addressed the review comments from Tomas Henzl. Current driver updates reply post host index to let firmware know that replies are processed, while returning from ISR function, only if there is no oustanding replies in reply queue. Driver will free the request frame immediately from ISR but reply post host index is not yet updated. It means freed request can be used by submission path and there may be a tight loop in request/reply path. In such condition, firmware may crash when it tries to post reply and there is no free reply post descriptor. Eventually two things needs to be change to avoid this issue. Increase reply queue depth (double than request queue) to accommodate worst case scenario. Update reply post host index to firmware once it reach to some pre-defined threshold value. This change will make sure that firmware will always have some buffer of reply descriptor and will never find empty reply descriptor in completion path. Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
* [SCSI] megaraid_sas: Add Dell PowerEdge VRTX SR-IOV VF supportadam radford2014-03-151-0/+3
| | | | | | | | | | | The following patch for megaraid_sas adds Dell PowerEdge VRTS SR-IOV VF support (Device ID 0x002f). This patch has some > 80 column lines that need to be left in place for code readability purposes. Signed-off-by: Adam Radford <aradford@gmail.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds2013-09-151-2/+29
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull misc SCSI driver updates from James Bottomley: "This patch set is a set of driver updates (megaraid_sas, fnic, lpfc, ufs, hpsa) we also have a couple of bug fixes (sd out of bounds and ibmvfc error handling) and the first round of esas2r checker fixes and finally the much anticipated big endian additions for megaraid_sas" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (47 commits) [SCSI] fnic: fnic Driver Tuneables Exposed through CLI [SCSI] fnic: Kernel panic while running sh/nosh with max lun cfg [SCSI] fnic: Hitting BUG_ON(io_req->abts_done) in fnic_rport_exch_reset [SCSI] fnic: Remove QUEUE_FULL handling code [SCSI] fnic: On system with >1.1TB RAM, VIC fails multipath after boot up [SCSI] fnic: FC stat param seconds_since_last_reset not getting updated [SCSI] sd: Fix potential out-of-bounds access [SCSI] lpfc 8.3.42: Update lpfc version to driver version 8.3.42 [SCSI] lpfc 8.3.42: Fixed issue of task management commands having a fixed timeout [SCSI] lpfc 8.3.42: Fixed inconsistent spin lock usage. [SCSI] lpfc 8.3.42: Fix driver's abort loop functionality to skip IOs already getting aborted [SCSI] lpfc 8.3.42: Fixed failure to allocate SCSI buffer on PPC64 platform for SLI4 devices [SCSI] lpfc 8.3.42: Fix WARN_ON when driver unloads [SCSI] lpfc 8.3.42: Avoided making pci bar ioremap call during dual-chute WQ/RQ pci bar selection [SCSI] lpfc 8.3.42: Fixed driver iocbq structure's iocb_flag field running out of space [SCSI] lpfc 8.3.42: Fix crash on driver load due to cpu affinity logic [SCSI] lpfc 8.3.42: Fixed logging format of setting driver sysfs attributes hard to interpret [SCSI] lpfc 8.3.42: Fixed back to back RSCNs discovery failure. [SCSI] lpfc 8.3.42: Fixed race condition between BSG I/O dispatch and timeout handling [SCSI] lpfc 8.3.42: Fixed function mode field defined too small for not recognizing dual-chute mode ...
| * [SCSI] megaraid_sas: addded support for big endian architectureSumit.Saxena@lsi.com2013-09-101-0/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch will add big endian architecture support to megaraid_sas driver. The support added is for LSI MegaRAID all generation controllers- (3Gb/s, 6Gb/s and 12 Gb/s controllers). We have done basic sanity test @ppc64 arch and @x86_64. Additional testing/observations are welcome. [jejb: fix up rejections] Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: Sumit Saxena <sumit.saxena@lsi.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] megaraid_sas: Add High Availability clustering support using shared ↵adam radford2013-09-061-2/+5
| | | | | | | | | | | | | | Logical Disks Signed-off-by: Adam Radford <aradford@gmail.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* | scsi/megaraid fixed several typos in commentsMatthias Schid2013-08-201-1/+1
|/ | | | | | | | | | Fixed several typos in comments in megaraid_mbox.c, megaraid_mm.c and megaraid_sas_fusion.h. Signed-off-by: Matthias Schid <aircrach115@gmail.com> Signed-off-by: Stefan Huber <steffhip@gmail.com> Signed-off-by: Simon Puels <simon.puels@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>