Commit message  Author  Age  Files  Lines
* wlcore: Use generic runtime pm calls for wowlan elp configuration  Eyal Reizer  2018-06-27  1  -38/+13
| | | | | | | | | | | | | | With runtime PM enabled, we can now use calls to pm_runtime_force_suspend and pm_runtime_force_resume for enabling elp during suspend when wowlan is enabled and waking the chip from elp on resume. Remove the custom API that was used to ensure that the command that is used to allow ELP during suspend is completed before the system suspend. Signed-off-by: Eyal Reizer <eyalr@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
* wlcore: Fix timout errors after recovery  Tony Lindgren  2018-06-27  1  -2/+1
| After enabling runtime PM, if we force hardware reset multiple times with: # echo 1 > /sys/kernel/debug/ieee80211/phy0/wlcore/start_recovery After a few tries we will get the following error: wlcore: ERROR timeout waiting for the hardware to complete initialization And then wlcore is unable to reconnect until after the wlcore related modules are reloaded. Let's fix this by moving pm_runtime_put() earlier before we restart the hardware. And let's use the sync version to make sure we're done before we restart. Note that we will still get an -EBUSY warning from wl12xx_sdio_set_power() but let's fix that separately once we know exactly why we get the warning. Reported-by: Eyal Reizer <eyalr@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
* wlcore: Fix misplaced PM call for scan_complete_work()  Tony Lindgren  2018-06-27  1  -2/+2
| With runtime PM enabled, we now need to keep wlcore enabled longer, until after we're done calling wlcore_cmd_regdomain_config_locked(): scan_complete_work() wlcore_cmd_regdomain_config_locked() wlcore_cmd_send_failsafe() wl12xx_sdio_raw_read() Note that this was not needed before runtime PM support as the custom PM code had its own timer. We have not yet enabled runtime PM autosuspend for wlcore and this is why this issue now shows up. Let's fix the issues first before we enable runtime PM autosuspend. Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
* wlcore: Add support for runtime PM  Tony Lindgren  2018-06-27  14  -338/+416
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can update wlcore to use PM runtime by adding functions for wlcore_runtime_suspend() and wlcore_runtime_resume() and replacing calls to wl1271_ps_elp_wakeup() and wl1271_ps_elp_sleep() with calls to pm_runtime_get_sync() and pm_runtime_put(). Note that the new wlcore_runtime_suspend() and wlcore_runtime_resume() functions are based on simplified versions of wl1271_ps_elp_sleep() and wl1271_ps_elp_wakeup(). We don't want to use the old functions as we can now take advantage of the runtime PM usage count. And we don't need the old elp_work at all. And we can also remove WL1271_FLAG_ELP_REQUESTED that is no longer needed. Pretty much the only place where we are not just converting the existing functions is wl1271_op_suspend() where we add pm_runtime_put_noidle() to keep the calls paired. As the next step is to implement runtime PM autosuspend, let's not add wrapper functions for the generic runtime PM calls. We would be getting rid of any wrapper functions anyways. After autoidle we should be able to start using Linux generic wakeirqs for the padconf interrupt. Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
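The point about keeping calls paired can be illustrated with a toy model of a usage count. This is a minimal Python sketch, not kernel code: `ToyDevice`, `get_sync()`, `put()` and `nested_io()` are hypothetical names that only mimic the counting behaviour of pm_runtime_get_sync() and pm_runtime_put(), and the real pm_runtime_put() may suspend the device asynchronously rather than immediately.

```python
class ToyDevice:
    """Toy stand-in for a device with a runtime PM usage count (not kernel code)."""

    def __init__(self):
        self.usage_count = 0
        self.awake = False
        self.resumes = 0  # how many times the device had to be woken up

    def get_sync(self):
        # Mimics pm_runtime_get_sync(): bump the count, wake the device if needed.
        self.usage_count += 1
        if not self.awake:
            self.awake = True
            self.resumes += 1

    def put(self):
        # Mimics pm_runtime_put(): drop the count, let the device sleep at zero.
        # Simplification: the suspend happens immediately in this model.
        if self.usage_count == 0:
            raise RuntimeError("unbalanced put: calls are not paired")
        self.usage_count -= 1
        if self.usage_count == 0:
            self.awake = False


def nested_io(dev):
    """Two nested users of the device; each get is paired with one put."""
    dev.get_sync()          # outer caller
    dev.get_sync()          # inner caller, device is already awake
    dev.put()               # inner caller done, outer still holds a reference
    awake_in_between = dev.awake
    dev.put()               # last reference dropped
    return awake_in_between, dev.awake, dev.resumes
```

With paired calls the device stays awake while any caller still holds a reference and is woken only once; an extra `put()` raises in this model, which is the kind of imbalance the preparatory "Make sure PM calls are paired" commit removes.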
* wlcore: Make sure PM calls are paired  Tony Lindgren  2018-06-27  1  -2/+8
| | | | | | | | | The call to wl1271_ps_elp_wakeup() in wl12xx_queue_recovery_work() is unpaired. Let's remove it and add paired calls to wl1271_recovery_work() instead in preparation for changing things to use runtime PM. Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
* wlcore: Add missing PM call for wlcore_cmd_wait_for_event_or_timeout()  Tony Lindgren  2018-06-27  1  -0/+6
| Otherwise we can get: WARNING: CPU: 0 PID: 55 at drivers/net/wireless/ti/wlcore/io.h:84 I've only seen this a few times with the runtime PM patches enabled so this one is probably not needed before that. This seems to work currently based on the current PM implementation timer. Let's apply this separately though in case others are hitting this issue. Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
* Merge ath-next from git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git  Kalle Valo  2018-06-18  21  -112/+126
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | ath.git patches for 4.19. Major changes: ath10k * support channel 173 * fix spectral scan for QCA9984 and QCA9888 chipsets ath6kl * add support for Dell Wireless 1537
| * ath9k: debug: fix spelling mistake "WATHDOG" -> "WATCHDOG"  Colin Ian King  2018-06-14  1  -1/+1
| | | | | | | | | | | | | | Trivial fix to spelling mistake in PR_IS message text. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
| * ath10k: handle resource init failure case  Govind Singh  2018-06-14  1  -2/+2
| | The return value of the resource init method is not assigned. Handle resource init failures so that we exit gracefully. Signed-off-by: Govind Singh <govinds@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
| * ath10k: skip data calibration for non-bmi target  Surabhi Vishnoi  2018-06-14  1  -7/+14
| | On non-BMI targets, e.g. WCN3990, data calibration is handled via QMI. Skip data calibration in the debug routine to enable ath10k debugfs for non-BMI targets. Signed-off-by: Surabhi Vishnoi <svishnoi@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
| * ath10k: do not mix spaces and tabs in Kconfig  Niklas Cassel  2018-06-14  1  -12/+12
| | | | | | | | | | | | | | Do not mix spaces and tabs in Kconfig. Signed-off-by: Niklas Cassel <niklas.cassel@linaro.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
| * ath10k: snoc: sort include files  Brian Norris  2018-06-14  1  -7/+8
| | | | | | | | | | | | | | Sort these alphabetically, with local includes in a separate section. Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
| * ath10k: snoc: drop unused WCN3990_CE_ATTR_FLAGS  Brian Norris  2018-06-14  1  -1/+1
| | | | | | | | | | | | | | We started using a common CE_ATTR_FLAGS definition, so drop this one. Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
| * ath10k: snoc: stop including pci.h  Brian Norris  2018-06-14  3  -43/+42
| | | | | | | | | | | | | | | | | | | | | | | | It's easier to violate abstractions and introduce bugs when snoc.h is including pci.h. Let's not do that. I'm not extremely familiar with this driver yet, but several of the shared PCI/SNOC bits seem to be related to the Copy Engine, so move them to ce.h. Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
| * ath10k: snoc: use correct bus-specific pointer in RX retry  Brian Norris  2018-06-14  1  -1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | We're 'ath10k_snoc', not 'ath10k_pci'. This probably means we're accessing junk data in ath10k_snoc_rx_replenish_retry(), unless 'ath10k_snoc' and 'ath10k_pci' happen to have very similar struct layouts. Noticed by inspection. Fixes: d915105231ca ("ath10k: add hif rx methods for wcn3990") Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
| * ath10k: snoc: use module_platform_driver() macro  Brian Norris  2018-06-14  1  -19/+1
| | | | | | | | | | | | | | | | ath10k_snoc_init()/ath10k_snoc_exit() don't add much value; module_platform_driver() can remove the boilerplate. Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
| * ath10k: use crash_dump enum instead of magic numbers  Brian Norris  2018-06-14  1  -4/+2
| | | | | | | | | | | | | | The comments are telling you what the enum could tell you instead. Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
| * ath10k: use dma_zalloc_coherent instead of allocator/memset  YueHaibing  2018-06-14  1  -3/+1
| | | | | | | | | | | | | | | | Use dma_zalloc_coherent instead of dma_alloc_coherent followed by memset 0. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
| * ath10k: fix incorrect size of dma_free_coherent in ath10k_ce_alloc_src_ring_64  YueHaibing  2018-06-14  1  -1/+1
| | sizeof(struct ce_desc) looks like a copy-paste mistake; use sizeof(struct ce_desc_64) instead to avoid a memory leak. Fixes: b7ba83f7c414 ("ath10k: add support for shadow register for WNC3990") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
| * ath10k: make some functions static  Wei Yongjun  2018-06-14  1  -2/+2
| | | | | | | | | | | | | | | | | | | | | | | | Fixes the following sparse warnings: drivers/net/wireless/ath/ath10k/snoc.c:823:5: warning: symbol 'ath10k_snoc_get_ce_id_from_irq' was not declared. Should it be static? drivers/net/wireless/ath/ath10k/snoc.c:871:6: warning: symbol 'ath10k_snoc_init_napi' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
| * ath10k: fix spectral scan for QCA9984 and QCA9888 chipsets  Karthikeyan Periyasamy  2018-06-14  3  -1/+17
| | The spectral scan has always been broken on QCA9984 and QCA9888. Introduce a hardware parameter 'spectral_bin_offset' to resolve this issue for QCA9984 and QCA9888 chipsets. For other chipsets, the hardware parameter 'spectral_bin_offset' is zero so that existing behaviour is retained as it is. In QCA9984 and QCA9888 chipsets, the hardware param value 'spectral_bin_discard' is 12 bytes. These 12 bytes are the sum of the segment index (4 bytes), extra bins before the actual data (4 bytes) and extra bins after the actual data (4 bytes). Until now all 12 bytes were discarded at the end of each sample, so incorrectly arranged samples ended up in the spectral scan dump. To fix this issue, we have to discard the first 8 bytes and the last 4 bytes of every sample, so 12 bytes are discarded in total. In every sample we need to consider the offset while taking the actual spectral data. For QCA9984 and QCA9888 the offset is 8 bytes (segment index + extra bins before actual data). Hardware tested: QCA9984 and QCA9888 Firmware tested: 10.4-3.5.3-00053 Signed-off-by: Karthikeyan Periyasamy <periyasa@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
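The byte arithmetic in this message can be worked through with a small sketch. This is an illustrative Python model, not the driver code: `extract_bins()` is a hypothetical helper, and only the parameter names `bin_offset` and `bin_discard` mirror the 'spectral_bin_offset' and 'spectral_bin_discard' hardware parameters named above.

```python
def extract_bins(sample: bytes, bin_offset: int, bin_discard: int) -> bytes:
    """Return the spectral data of one sample.

    bin_discard bytes are dropped in total: bin_offset of them from the
    front of the sample and the remainder from the end.
    """
    if not 0 <= bin_offset <= bin_discard:
        raise ValueError("offset must lie between 0 and the discard size")
    if len(sample) < bin_discard:
        raise ValueError("sample shorter than the bytes to discard")
    trailing = bin_discard - bin_offset
    return sample[bin_offset:len(sample) - trailing]


# Layout described in the commit message for QCA9984/QCA9888:
# segment index (4) + extra bins before (4) + data + extra bins after (4).
payload = b"DATA1234"
sample = b"SEG0" + b"PRE0" + payload + b"POST"

old_result = extract_bins(sample, bin_offset=0, bin_discard=12)  # discard at the end only
new_result = extract_bins(sample, bin_offset=8, bin_discard=12)  # 8 in front, 4 at the end
```

With an offset of zero the last 12 bytes are dropped, which leaves the segment index and leading extra bins in place of the data; with an offset of 8 the slice lines up with the payload.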
| * ath10k: support use of channel 173  Ben Greear  2018-06-14  3  -2/+6
| | | | | | | | | | | | | | | | | | The India regulatory domain allows CH 173, so add that to the available channel list. I verified basic connectivity between a 9880 and 9984 NIC. Signed-off-by: Ben Greear <greearb@candelatech.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
| * ath10k: fix memory leak of tpc_stats  Colin Ian King  2018-06-13  1  -4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | Currently tpc_stats is allocated and is leaked on the return path if num_tx_chain is greater than WMI_TPC_TX_N_CHAIN. Avoid this leak by performing the check on num_tx_chain before the allocation of tpc_stats. Detected by CoverityScan, CID#1469422 ("Resource Leak") Fixes: 4b190675ad06 ("ath10k: fix kernel panic while reading tpc_stats") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
| * ath9k: mark expected switch fall-throughs  Gustavo A. R. Silva  2018-06-13  3  -0/+4
| | | | | | | | | | | | | | | | In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
| * ath6kl: mark expected switch fall-throughs  Gustavo A. R. Silva  2018-06-13  1  -0/+3
| | | | | | | | | | | | | | | | | | In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Steve deRosier <derosier@cal-sierra.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
| * ath6kl: add support for Dell Wireless 1537  Guy Chronister  2018-06-13  1  -0/+1
| | | | | | | | | | | | | | | | | | This is a Qualcomm Atheros AR6004X with an sdio ID of 0x19 and hardware ID of 0271:0419. Tested on a Dell Venue 11 Pro 7130 with a self compiled kernel. Signed-off-by: Guy Chronister <guylovesbritt@gmail.com> [kvalo@codeaurora.org: cleanup commit log] Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
| * ath5k: mark expected switch fall-through  Gustavo A. R. Silva  2018-06-13  1  -0/+1
| | | | | | | | | | | | | | | | In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
| * ath10k: htt_tx: mark expected switch fall-throughs  Gustavo A. R. Silva  2018-06-13  1  -2/+2
| | | | | | | | | | | | | | | | | | | | | | | | In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Notice that in this particular case, I replaced "pass through" with a proper "fall through" comment, which is what GCC is expecting to find. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
* | Linux 4.18-rc1 (tag: v4.18-rc1)  Linus Torvalds  2018-06-17  1  -2/+2
| |
* | Merge tag 'for-linus-20180616' of git://git.kernel.dk/linux-block  Linus Torvalds  2018-06-17  16  -275/+174
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block fixes from Jens Axboe: "A collection of fixes that should go into -rc1. This contains: - bsg_open vs bsg_unregister race fix (Anatoliy) - NVMe pull request from Christoph, with fixes for regressions in this window, FC connect/reconnect path code unification, and a trace point addition. - timeout fix (Christoph) - remove a few unused functions (Christoph) - blk-mq tag_set reinit fix (Roman)" * tag 'for-linus-20180616' of git://git.kernel.dk/linux-block: bsg: fix race of bsg_open and bsg_unregister block: remov blk_queue_invalidate_tags nvme-fabrics: fix and refine state checks in __nvmf_check_ready nvme-fabrics: handle the admin-only case properly in nvmf_check_ready nvme-fabrics: refactor queue ready check blk-mq: remove blk_mq_tagset_iter nvme: remove nvme_reinit_tagset nvme-fc: fix nulling of queue data on reconnect nvme-fc: remove reinit_request routine blk-mq: don't time out requests again that are in the timeout handler nvme-fc: change controllers first connect to use reconnect path nvme: don't rely on the changed namespace list log nvmet: free smart-log buffer after use nvme-rdma: fix error flow during mapping request data nvme: add bio remapping tracepoint nvme: fix NULL pointer dereference in nvme_init_subsystem blk-mq: reinit q->tag_set_list entry only after grace period
| * | bsg: fix race of bsg_open and bsg_unregister  Anatoliy Glagolev  2018-06-15  1  -11/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The existing implementation allows races between bsg_unregister and bsg_open paths. bsg_unregister and request_queue cleanup and deletion may start and complete right after bsg_get_device (in bsg_open path) retrieves bsg_class_device and releases the mutex. Then bsg_open path touches freed memory of bsg_class_device and request_queue. One possible fix is to hold the mutex all the way through bsg_get_device instead of releasing it after bsg_class_device retrieval. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-Off-By: Anatoliy Glagolev <glagolig@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | block: remov blk_queue_invalidate_tags  Christoph Hellwig  2018-06-15  3  -38/+1
| | | | | | | | | | | | | | | | | | | | | | | | This function is entirely unused, so remove it and the tag_queue_busy member of struct request_queue. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | Merge branch 'nvme-4.18' of git://git.infradead.org/nvme into for-linus  Jens Axboe  2018-06-15  11  -224/+154
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NVMe fixes from Christoph: "Fix various little regressions introduced in this merge window, plus a rework of the fibre channel connect and reconnect path to share the code instead of having separate sets of bugs. Last but not least a trivial trace point addition from Hannes." * 'nvme-4.18' of git://git.infradead.org/nvme: nvme-fabrics: fix and refine state checks in __nvmf_check_ready nvme-fabrics: handle the admin-only case properly in nvmf_check_ready nvme-fabrics: refactor queue ready check blk-mq: remove blk_mq_tagset_iter nvme: remove nvme_reinit_tagset nvme-fc: fix nulling of queue data on reconnect nvme-fc: remove reinit_request routine nvme-fc: change controllers first connect to use reconnect path nvme: don't rely on the changed namespace list log nvmet: free smart-log buffer after use nvme-rdma: fix error flow during mapping request data nvme: add bio remapping tracepoint nvme: fix NULL pointer dereference in nvme_init_subsystem
| | * | nvme-fabrics: fix and refine state checks in __nvmf_check_ready  Christoph Hellwig  2018-06-15  1  -20/+19
| | | | - make sure we only allow internally generated commands in any non-live state - only allow connect commands on non-live queues when actually in the new or connecting states - treat all other non-live, non-dead states the same as a default catch-all This fixes a regression where we could not shut down a controller in an orderly fashion as we didn't allow the internally generated Property Set command, and also ensures we don't accidentally let a Connect command through in the wrong state. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: James Smart <james.smart@broadcom.com>
| | * | nvme-fabrics: handle the admin-only case properly in nvmf_check_ready  Christoph Hellwig  2018-06-15  1  -1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the ADMIN_ONLY state we don't have any I/O queues, but we should accept all admin commands without further checks. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: James Smart <james.smart@broadcom.com>
| | * | nvme-fabrics: refactor queue ready check  Christoph Hellwig  2018-06-15  5  -50/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the is_connected check to the fibre channel transport, as it has no meaning for other transports. To facilitate this split out a new nvmf_fail_nonready_command helper that is called by the transport when it is asked to handle a command on a queue that is not ready. Also avoid a function call for the queue live fast path by inlining the check. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: James Smart <james.smart@broadcom.com>
| | * | blk-mq: remove blk_mq_tagset_iter  Christoph Hellwig  2018-06-14  2  -31/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Unused now that nvme stopped using it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jens Axboe <axboe@kernel.dk>
| | * | nvme: remove nvme_reinit_tagset  Christoph Hellwig  2018-06-14  2  -12/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Unused now that all transports stopped using it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jens Axboe <axboe@kernel.dk>
| | * | nvme-fc: fix nulling of queue data on reconnect  James Smart  2018-06-14  1  -6/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The reconnect path is calling the init routines to clear a queue structure. But the queue structure has state that perhaps needs to persist as long as the controller is live. Remove the nvme_fc_init_queue() calls on reconnect. The nvme_fc_free_queue() calls will clear state bits and reset any relevant queue state for a new connection. Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| | * | nvme-fc: remove reinit_request routine  James Smart  2018-06-14  1  -20/+0
| | | | The reinit_request routine is not necessary. Remove support for the op callback. As all that nvme_reinit_tagset() does is iterate and call the reinit routine, it too has no purpose. Remove the call. Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| | * | nvme-fc: change controllers first connect to use reconnect path  James Smart  2018-06-14  1  -57/+47
| | | | Current code follows the framework that has been in the transports from the beginning where initial link-side controller connect occurs as part of "creating the controller". Thus that first connect fully talks to the controller and obtains values that can then be used for blk-mq setup, etc. It also means that everything about the controller is fully known before the "create controller" call returns. This has several weaknesses: - The initial create_ctrl call made by the cli will block for a long time as wire transactions are performed synchronously. This delay becomes longer if errors occur or connectivity is lost and retries need to be performed. - Code wise, it means there is a separate connect path for initial controller connect vs the (same) steps used in the reconnect path. - And as there are separate paths, it means there is separate error handling and retry logic. It also plays havoc with the NEW state (should transition out of it after successful initial connect) vs the RESETTING and CONNECTING (reconnect) states that want to be transitioned to on error. - As there are separate paths, to recover from errors and disruptions, it requires separate recovery/retry paths as well and can severely convolute the controller state. This patch reworks the fc transport to use the same connect paths for the initial connection as it uses for reconnect. This makes a single path for error recovery and handling. This patch: - Removes the driving of the initial connect and replaces it with a state transition to CONNECTING and initiating the reconnect thread. 
A dummy state transition of RESETTING had to be traversed as a direct transition of NEW->CONNECTING is not allowed. Given that the controller is "new", the RESETTING transition is a simple no-op. Once in the reconnecting thread, the normal behaviors of ctrl_loss_tmo (max_retries * connect_delay) and dev_loss_tmo will apply before the controller is torn down. - Only if the state transitions couldn't be traversed and the reconnect thread not scheduled, will the controller be torn down while in create_ctrl. - The prior code used the controller state of NEW to indicate whether request queues had been initialized or not. For the admin queue, the request queue is always created, so there's no need to check a state. For IO queues, change to tracking whether a successful io request queue create has occurred (e.g. 1st successful connect). - The initial controller id is initialized to the dynamic controller id used in the initial connect message. It will be overwritten by the real controller id once the controller is connected on the wire. Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
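The dummy RESETTING step described in this message can be pictured as a tiny state machine. This is a hedged Python sketch: the `ALLOWED` table lists only the two transitions the message implies are permitted, and `change_state()` and `start_initial_connect()` are hypothetical names, not the nvme core API.

```python
# Partial transition table, an assumption based only on the commit message:
# NEW -> RESETTING and RESETTING -> CONNECTING are allowed,
# while a direct NEW -> CONNECTING is not.
ALLOWED = {
    ("NEW", "RESETTING"),
    ("RESETTING", "CONNECTING"),
}


def change_state(current: str, new: str) -> bool:
    """Return True if the transition is permitted by the toy table."""
    return (current, new) in ALLOWED


def start_initial_connect():
    """Walk NEW -> RESETTING -> CONNECTING; return the path, or None on failure.

    A None result stands for the case where the transitions could not be
    traversed and the controller would be torn down inside create_ctrl.
    """
    state = "NEW"
    path = [state]
    for nxt in ("RESETTING", "CONNECTING"):
        if not change_state(state, nxt):
            return None
        state = nxt
        path.append(state)
    return path
```

Once the walk reaches CONNECTING, the initial connect is handled by the same reconnect logic as any later reconnect, which is the single error-handling path the patch is after.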
| | * | nvme: don't rely on the changed namespace list log  Christoph Hellwig  2018-06-13  1  -25/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Don't optimize our namespace rescan based on the changed namespace list log page as userspace might have changed the content through reading it. Suggested-by: Keith Busch <keith.busch@linux.intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@linux.intel.com> Reviewed-by: Hannes Reinecke <hare@suse.com>
| | * | nvmet: free smart-log buffer after use  Chaitanya Kulkarni  2018-06-11  1  -1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Free smart-log buffer allocated in the function after use. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| | * | nvme-rdma: fix error flow during mapping request data  Max Gurtovoy  2018-06-11  1  -7/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After dma mapping the sgl, we map the sgl to nvme sgl descriptor. In case of failure during the last mapping we never dma unmap the sgl. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| | * | nvme: add bio remapping tracepoint  Hannes Reinecke  2018-06-11  1  -0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adding a tracepoint to trace bio remapping for native nvme multipath. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
| | * | nvme: fix NULL pointer dereference in nvme_init_subsystem  Israel Rukshin  2018-06-11  1  -1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When using nvme-pci driver the nvmf_ctrl_options is NULL. There is no need to check for discovery_nqn flag at non-fabrics controller. Fixes: 181303d0 ("nvme-fabrics: allow duplicate connections to the discovery controller") Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | | blk-mq: don't time out requests again that are in the timeout handler  Christoph Hellwig  2018-06-14  2  -0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can currently call the timeout handler again on a request that has already been handed over to the timeout handler. Prevent that with a new flag. Fixes: 12f5b931 ("blk-mq: Remove generation seqeunce") Reported-by: Andrew Randrianasulu <randrianasulu@gmail.com> Tested-by: Andrew Randrianasulu <randrianasulu@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | blk-mq: reinit q->tag_set_list entry only after grace period  Roman Pen  2018-06-11  1  -2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is not allowed to reinit q->tag_set_list list entry while RCU grace period has not completed yet, otherwise the following soft lockup in blk_mq_sched_restart() happens: [ 1064.252652] watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [fio:9270] [ 1064.254445] task: ffff99b912e8b900 task.stack: ffffa6d54c758000 [ 1064.254613] RIP: 0010:blk_mq_sched_restart+0x96/0x150 [ 1064.256510] Call Trace: [ 1064.256664] <IRQ> [ 1064.256824] blk_mq_free_request+0xea/0x100 [ 1064.256987] msg_io_conf+0x59/0xd0 [ibnbd_client] [ 1064.257175] complete_rdma_req+0xf2/0x230 [ibtrs_client] [ 1064.257340] ? ibtrs_post_recv_empty+0x4d/0x70 [ibtrs_core] [ 1064.257502] ibtrs_clt_rdma_done+0xd1/0x1e0 [ibtrs_client] [ 1064.257669] ib_create_qp+0x321/0x380 [ib_core] [ 1064.257841] ib_process_cq_direct+0xbd/0x120 [ib_core] [ 1064.258007] irq_poll_softirq+0xb7/0xe0 [ 1064.258165] __do_softirq+0x106/0x2a2 [ 1064.258328] irq_exit+0x92/0xa0 [ 1064.258509] do_IRQ+0x4a/0xd0 [ 1064.258660] common_interrupt+0x7a/0x7a [ 1064.258818] </IRQ> Meanwhile another context frees other queue but with the same set of shared tags: [ 1288.201183] INFO: task bash:5910 blocked for more than 180 seconds. 
[ 1288.201833] bash D 0 5910 5820 0x00000000 [ 1288.202016] Call Trace: [ 1288.202315] schedule+0x32/0x80 [ 1288.202462] schedule_timeout+0x1e5/0x380 [ 1288.203838] wait_for_completion+0xb0/0x120 [ 1288.204137] __wait_rcu_gp+0x125/0x160 [ 1288.204287] synchronize_sched+0x6e/0x80 [ 1288.204770] blk_mq_free_queue+0x74/0xe0 [ 1288.204922] blk_cleanup_queue+0xc7/0x110 [ 1288.205073] ibnbd_clt_unmap_device+0x1bc/0x280 [ibnbd_client] [ 1288.205389] ibnbd_clt_unmap_dev_store+0x169/0x1f0 [ibnbd_client] [ 1288.205548] kernfs_fop_write+0x109/0x180 [ 1288.206328] vfs_write+0xb3/0x1a0 [ 1288.206476] SyS_write+0x52/0xc0 [ 1288.206624] do_syscall_64+0x68/0x1d0 [ 1288.206774] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 What happened is the following: 1. There are several MQ queues with shared tags. 2. One queue is about to be freed and its task is now in blk_mq_del_queue_tag_set(). 3. Another CPU is in blk_mq_sched_restart() and loops over all queues in the tag list in order to find an hctx to restart. Because the linked list entry was modified in blk_mq_del_queue_tag_set() without properly waiting for a grace period, blk_mq_sched_restart() never ends, spinning in list_for_each_entry_rcu_rr(), thus the soft lockup. Fix is simple: reinit the list entry after an RCU grace period has elapsed. Fixes: 705cda97ee3a ("blk-mq: Make it safe to use RCU to iterate over blk_mq_tag_set.tag_list") Cc: stable@vger.kernel.org Cc: Sagi Grimberg <sagi@grimberg.me> Cc: linux-block@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
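Why an early reinit makes a concurrent reader spin can be shown with a toy circular list. This is a single-threaded Python sketch under stated assumptions: `Node`, `list_add_tail()`, `list_del_keep_next()`, `reinit()` and `walk_from()` are hypothetical helpers that imitate only the pointer effects of the kernel's list_del_rcu() (unlink, keep the forward pointer) and INIT_LIST_HEAD() (point the entry at itself); no real RCU or concurrency is involved.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.next = self   # a fresh node points at itself, like an empty list head
        self.prev = self


def list_add_tail(node, head):
    """Insert node just before head in the circular list."""
    node.prev = head.prev
    node.next = head
    head.prev.next = node
    head.prev = node


def list_del_keep_next(node):
    """Unlink node but leave node.next intact so a reader on it can move on."""
    node.prev.next = node.next
    node.next.prev = node.prev


def reinit(node):
    """Point the node at itself, as a list-head initialiser would."""
    node.next = node
    node.prev = node


def walk_from(node, head, max_steps=100):
    """A reader already standing on node walks forward until it reaches head.

    Returns the number of steps taken, or None if head is never reached
    within max_steps (the toy equivalent of the soft lockup).
    """
    steps = 0
    cur = node
    while cur is not head:
        if steps >= max_steps:
            return None
        cur = cur.next
        steps += 1
    return steps


def build():
    head = Node("head")
    nodes = [Node(n) for n in "abc"]
    for n in nodes:
        list_add_tail(n, head)
    return head, nodes
```

After `list_del_keep_next(b)` a reader parked on `b` still reaches the head in two steps, whereas calling `reinit(b)` while that reader is still there leaves it looping on `b` forever. Deferring the reinit until no reader can still hold the entry, which is what waiting for the grace period guarantees, avoids this.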
* | | | Merge tag 'docs-broken-links' of git://linuxtv.org/mchehab/experimental  Linus Torvalds  2018-06-17  206  -339/+372
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull documentation fixes from Mauro Carvalho Chehab: "This solves a series of broken links for files under Documentation, and improves a script meant to detect such broken links (see scripts/documentation-file-ref-check). The changes on this series are: - can.rst: fix a footnote reference; - crypto_engine.rst: Fix two parsing warnings; - Fix a lot of broken references to Documentation/*; - improve the scripts/documentation-file-ref-check script, in order to help detecting/fixing broken references, preventing false-positives. After this patch series, only 33 broken references to doc files are detected by scripts/documentation-file-ref-check" * tag 'docs-broken-links' of git://linuxtv.org/mchehab/experimental: (26 commits) fix a series of Documentation/ broken file name references Documentation: rstFlatTable.py: fix a broken reference ABI: sysfs-devices-system-cpu: remove a broken reference devicetree: fix a series of wrong file references devicetree: fix name of pinctrl-bindings.txt devicetree: fix some bindings file names MAINTAINERS: fix location of DT npcm files MAINTAINERS: fix location of some display DT bindings kernel-parameters.txt: fix pointers to sound parameters bindings: nvmem/zii: Fix location of nvmem.txt docs: Fix more broken references scripts/documentation-file-ref-check: check tools/*/Documentation scripts/documentation-file-ref-check: get rid of false-positives scripts/documentation-file-ref-check: hint: dash or underline scripts/documentation-file-ref-check: add a fix logic for DT 
scripts/documentation-file-ref-check: accept more wildcards at filenames scripts/documentation-file-ref-check: fix help message media: max2175: fix location of driver's companion documentation media: v4l: fix broken video4linux docs locations media: dvb: point to the location of the old README.dvb-usb file ...
| * | | | fix a series of Documentation/ broken file name references  Mauro Carvalho Chehab  2018-06-15  8  -9/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As files move around, their previous links break. Fix the references for them. Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Acked-by: Jonathan Corbet <corbet@lwn.net>