summaryrefslogtreecommitdiffstats
path: root/drivers/nvme/host
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'for-linus-20190524' of git://git.kernel.dk/linux-blockLinus Torvalds2019-05-243-41/+76
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block fixes from Jens Axboe: - NVMe pull request from Keith, with fixes from a few folks. - bio and sbitmap before atomic barrier fixes (Andrea) - Hang fix for blk-mq freeze and unfreeze (Bob) - Single segment count regression fix (Christoph) - AoE now has a new maintainer - tools/io_uring/ Makefile fix, and sync with liburing (me) * tag 'for-linus-20190524' of git://git.kernel.dk/linux-block: (23 commits) tools/io_uring: sync with liburing tools/io_uring: fix Makefile for pthread library link blk-mq: fix hang caused by freeze/unfreeze sequence block: remove the bi_seg_{front,back}_size fields in struct bio block: remove the segment size check in bio_will_gap block: force an unlimited segment size on queues with a virt boundary block: don't decrement nr_phys_segments for physically contigous segments sbitmap: fix improper use of smp_mb__before_atomic() bio: fix improper use of smp_mb__before_atomic() aoe: list new maintainer for aoe driver nvme-pci: use blk-mq mapping for unmanaged irqs nvme: update MAINTAINERS nvme: copy MTFA field from identify controller nvme: fix memory leak for power latency tolerance nvme: release namespace SRCU protection before performing controller ioctls nvme: merge nvme_ns_ioctl into nvme_ioctl nvme: remove the ifdef around nvme_nvm_ioctl nvme: fix srcu locking on error return in nvme_get_ns_from_disk nvme: Fix known effects nvme-pci: Sync queues on reset ...
| * nvme-pci: use blk-mq mapping for unmanaged irqsKeith Busch2019-05-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a device is providing a single IRQ vector, the IO queue will share that vector with the admin queue. This is an unmanaged vector, so does not have a valid PCI IRQ affinity. Avoid trying to extract a managed affinity in this case and let blk-mq set up the cpu:queue mapping instead. Otherwise we'd hit the following warning when the device is using MSI: WARNING: CPU: 4 PID: 7 at drivers/pci/msi.c:1272 pci_irq_get_affinity+0x66/0x80 Modules linked in: nvme nvme_core serio_raw CPU: 4 PID: 7 Comm: kworker/u16:0 Tainted: G W 5.2.0-rc1+ #494 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 Workqueue: nvme-reset-wq nvme_reset_work [nvme] RIP: 0010:pci_irq_get_affinity+0x66/0x80 Code: 0b 31 c0 c3 83 e2 10 48 c7 c0 b0 83 35 91 74 2a 48 8b 87 d8 03 00 00 48 85 c0 74 0e 48 8b 50 30 48 85 d2 74 05 39 70 14 77 05 <0f> 0b 31 c0 c3 48 63 f6 48 8d 04 76 48 8d 04 c2 f3 c3 48 8b 40 30 RSP: 0000:ffffb5abc01d3cc8 EFLAGS: 00010246 RAX: ffff9536786a39c0 RBX: 0000000000000000 RCX: 0000000000000080 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9536781ed000 RBP: ffff95367346a008 R08: ffff95367d43f080 R09: ffff953678c07800 R10: ffff953678164800 R11: 0000000000000000 R12: 0000000000000000 R13: ffff9536781ed000 R14: 00000000ffffffff R15: ffff95367346a008 FS: 0000000000000000(0000) GS:ffff95367d400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fdf814a3ff0 CR3: 000000001a20f000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: blk_mq_pci_map_queues+0x37/0xd0 nvme_pci_map_queues+0x80/0xb0 [nvme] blk_mq_alloc_tag_set+0x133/0x2f0 nvme_reset_work+0x105d/0x1590 [nvme] process_one_work+0x291/0x530 worker_thread+0x218/0x3d0 ? process_one_work+0x530/0x530 kthread+0x111/0x130 ? kthread_park+0x90/0x90 ret_from_fork+0x1f/0x30 ---[ end trace 74587339d93c83c0 ]--- Fixes: 22b5560195bd6 ("nvme-pci: Separate IO and admin queue IRQ vectors") Reported-by: Iván Chavero <ichavero@chavero.com.mx> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Keith Busch <keith.busch@intel.com>
| * nvme: copy MTFA field from identify controllerLaine Walker-Avina2019-05-211-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We use the controller's reported maximum firmware activation time as our timeout before resetting a controller for a failed activation notice, but this value was never being read so we could only use the default timeout. Copy the Identify Controller MTFA field to the corresponding nvme_ctrl's mtfa field. Fixes: b6dccf7fae433 (“nvme: add support for FW activation without reset”). Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Minwoo Im <minwoo.im@samsung.com> Signed-off-by: Laine Walker-Avina <laine.walker-avina@intel.com> [changelog, fix endian] Signed-off-by: Keith Busch <keith.busch@intel.com>
| * nvme: fix memory leak for power latency toleranceYufen Yu2019-05-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Unconditionally hide device pm latency tolerance when uninitializing the controller to ensure all qos resources are released so that we're not leaking this memory. This is safe to call if none were allocated in the first place, or were previously freed. Fixes: c5552fde102fc("nvme: Enable autonomous power state transitions") Suggested-by: Keith Busch <keith.busch@intel.com> Tested-by: David Milburn <dmilburn@redhat.com> Signed-off-by: Yufen Yu <yuyufen@huawei.com> [changelog] Signed-off-by: Keith Busch <keith.busch@intel.com>
| * nvme: release namespace SRCU protection before performing controller ioctlsChristoph Hellwig2019-05-171-5/+20
| | | | | | | | | | | | | | | | | | | | | | | | Holding the SRCU critical section protecting the namespace list can cause deadlocks when using the per-namespace admin passthrough ioctl to delete as namespace. Release it earlier when performing per-controller ioctls to avoid that. Reported-by: Kenneth Heitke <kenneth.heitke@intel.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * nvme: merge nvme_ns_ioctl into nvme_ioctlChristoph Hellwig2019-05-171-23/+24
| | | | | | | | | | | | | | | | Merge the two functions to make future changes a little easier. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
| * nvme: remove the ifdef around nvme_nvm_ioctlChristoph Hellwig2019-05-171-2/+0
| | | | | | | | | | | | | | | | | | We already have a proper stub if lightnvm is not enabled, so don't bother with the ifdef. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
| * nvme: fix srcu locking on error return in nvme_get_ns_from_diskChristoph Hellwig2019-05-171-4/+9
| | | | | | | | | | | | | | | | | | | | If we can't get a namespace don't leak the SRCU lock. nvme_ioctl was working around this, but nvme_pr_command wasn't handling this properly. Just do what callers would usually expect. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
| * nvme: Fix known effectsKeith Busch2019-05-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | We're trying to append known effects to the ones reported in the controller's log. The original patch accomplished this, but something went wrong when patch was merged causing the effects log to override the known effects. Link: http://lists.infradead.org/pipermail/linux-nvme/2019-May/023710.html Fixes: f4524cc45626 ("nvme-pci: add known admin effects to augument admin effects log page") Cc: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Keith Busch <keith.busch@intel.com>
| * nvme-pci: Sync queues on resetKeith Busch2019-05-173-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | A controller with multiple namespaces may have multiple request_queues with their own timeout work. If a controller fails with IO outstanding to diffent namespaces, each request queue may attempt to handle it, so ensure there is no previously scheduled timeout work executing prior to starting controller initialization by synchronizing with each queue. Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <keith.busch@intel.com>
| * nvme-pci: Unblock reset_work on IO failureKeith Busch2019-05-171-5/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The reset_work waits for queued IO to complete before setting the controller to live. If any of these times out and requeues, we won't be able to restart the controller because the reset_work is already running. Flush all entered requests to a failed completion if a timeout occurs in the connecting state, and ensure the controller can't transition to the live state after we've unblocked it from waiting for completions. Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <keith.busch@intel.com>
| * nvme-pci: Don't disable on timeout in reset stateKeith Busch2019-05-171-1/+2
| | | | | | | | | | | | | | | | | | The reset state doesn't dispatch commands that it needs to wait for anymore. If a timeout occurs in this state, the reset work is already disabling the controller, so just reset the request's timer. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <keith.busch@intel.com>
| * nvme-pci: Fix controller freeze wait disablingKeith Busch2019-05-171-6/+6
| | | | | | | | | | | | | | | | | | If a controller disabling didn't start a freeze, don't wait for the operation to complete. Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <keith.busch@intel.com>
* | treewide: Add SPDX license identifier - Makefile/KconfigThomas Gleixner2019-05-211-0/+1
|/ | | | | | | | | | | | | | Add SPDX license identifiers to all Make/Kconfig files which: - Have no license information of any form These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* Merge tag 'for-5.2/block-post-20190516' of git://git.kernel.dk/linux-blockLinus Torvalds2019-05-168-76/+63
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull more block updates from Jens Axboe: "This is mainly some late lightnvm changes that came in just before the merge window, as well as fixes that have been queued up since the initial pull request was frozen. This contains: - lightnvm changes, fixing race conditions, improving memory utilization, and improving pblk compatability (Chansol, Igor, Marcin) - NVMe pull request with minor fixes all over the map (via Christoph) - remove redundant error print in sata_rcar (Geert) - struct_size() cleanup (Jackie) - dasd CONFIG_LBADF warning fix (Ming) - brd cond_resched() improvement (Mikulas)" * tag 'for-5.2/block-post-20190516' of git://git.kernel.dk/linux-block: (41 commits) block/bio-integrity: use struct_size() in kmalloc() nvme: validate cntlid during controller initialisation nvme: change locking for the per-subsystem controller list nvme: trace all async notice events nvme: fix typos in nvme status code values nvme-fabrics: remove unused argument nvme-multipath: avoid crash on invalid subsystem cntlid enumeration nvme-fc: use separate work queue to avoid warning nvme-rdma: remove redundant reference between ib_device and tagset nvme-pci: mark expected switch fall-through nvme-pci: add known admin effects to augument admin effects log page nvme-pci: init shadow doorbell after each reset brd: add cond_resched to brd_free_pages sata_rcar: Remove ata_host_alloc() error printing s390/dasd: fix build warning in dasd_eckd_build_cp_raw lightnvm: pblk: use nvm_rq_to_ppa_list() lightnvm: pblk: simplify partial read path lightnvm: do not remove instance under global lock lightnvm: track inflight target creations lightnvm: pblk: recover only written metadata ...
| * nvme: validate cntlid during controller initialisationChristoph Hellwig2019-05-141-17/+24
| | | | | | | | | | | | | | | | | | | | | | | | The CNTLID value is required to be unique, and we do rely on this for correct operation. So reject any controller for which a non-unique CNTLID has been detected. Based on a patch from Hannes Reinecke. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.com>
| * nvme: change locking for the per-subsystem controller listChristoph Hellwig2019-05-141-18/+14
| | | | | | | | | | | | | | | | | | | | | | Life becomes a lot simpler if we just use the global nvme_subsystems_lock to protect this list. Given that it is only accessed during controller probing and removal that isn't a scalability problem either. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.com>
| * nvme: trace all async notice eventsChaitanya Kulkarni2019-05-142-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes the tracing of the NVMe Async events out of the switch so that it can trace all the events including the ones which are not handled in the nvme_handle_aen_notice(). The events which are not handled in the nvme_handle_aen_notice() such as NVME_AER_NOTICE_DISC_CHANGED corresponding event identifier needs to be added in the drivers/nvme/host/trace.h so that it can stringify the AER . Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * nvme-fabrics: remove unused argumentMinwoo Im2019-05-141-2/+2
| | | | | | | | | | | | | | | | | | | | The variable 'count' is not currently used by nvmf_create_ctrl(), so remove it. Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * nvme-multipath: avoid crash on invalid subsystem cntlid enumerationHannes Reinecke2019-05-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | A process holding an open reference to a removed disk prevents it from completing deletion, so its name continues to exist. A subsequent gendisk creation may have the same cntlid which risks collision when using that for the name. Use the unique ctrl->instance instead. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * nvme-fc: use separate work queue to avoid warningHannes Reinecke2019-05-131-3/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When tearing down a controller the following warning is issued: WARNING: CPU: 0 PID: 30681 at ../kernel/workqueue.c:2418 check_flush_dependency This happens as the err_work workqueue item is scheduled on the system workqueue (which has WQ_MEM_RECLAIM not set), but is flushed from a workqueue which has WQ_MEM_RECLAIM set. Fix this by providing an FC-NVMe specific workqueue. Fixes: 4cff280a5fcc ("nvme-fc: resolve io failures during connect") Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * nvme-rdma: remove redundant reference between ib_device and tagsetMax Gurtovoy2019-05-131-29/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the past, before adding f41725bb ("nvme-rdma: Use mr pool") commit, we needed a reference on the ib_device as long as the tagset was alive, as the MRs in the request structures needed a valid ib_device. Now, we allocate/deallocate MR pool per QP and consume on demand. Also remove nvme_rdma_free_tagset function and use blk_mq_free_tag_set instead, as it unneeded anymore. This commit also fixes a memory leakage and possible segmentation fault. When configuring the system with NIC teaming (aka bonding), we use 1 network interface to create an HA connection to the target side. In case one connection breaks down, nvme-rdma driver will get notification from rdma-cm layer that underlying address was change and will start error recovery process. During this process, we'll reconnect to the target via the second interface in the bond without destroying the tagset. This will cause a leakage of the initial rdma device (ndev) and miscount in the reference count of the new created rdma device (new ndev). In the final destruction (or in another error flow), we'll get a warning dump from the ib_dealloc_pd that we still have inflight MR's related to that pd. This happens becasue of the miscount of the reference tag of the rdma device and causing access violation to it's elements (some queues are not destroyed yet). Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * nvme-pci: mark expected switch fall-throughGustavo A. R. Silva2019-05-131-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. This patch fixes the following warning: drivers/nvme/host/pci.c: In function ‘nvme_timeout’: drivers/nvme/host/pci.c:1298:12: warning: this statement may fall through [-Wimplicit-fallthrough=] shutdown = true; ~~~~~~~~~^~~~~~ drivers/nvme/host/pci.c:1299:2: note: here case NVME_CTRL_CONNECTING: ^~~~ Warning level 3 was used: -Wimplicit-fallthrough=3 This patch is part of the ongoing efforts to enable -Wimplicit-fallthrough. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * nvme-pci: add known admin effects to augument admin effects log pageMaxim Levitsky2019-05-131-2/+1
| | | | | | | | | | | | | | | | | | Add known admin effects even if hardware has known admin effects page, since hardware can't be ever trusted to report sane values. (on my Intel DC P3700, it reports no side effects for namespace format) Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * nvme-pci: init shadow doorbell after each resetMaxim Levitsky2019-05-131-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The spec states: "The settings are not retained across a Controller Level Reset" Therefore the driver must enable the shadow doorbell, after each reset. This was caught while testing the nvme driver over upcoming nvme-mdev device. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Minwoo Im <minwoo.im@samsung.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * lightnvm: Inherit mdts from the parent nvme deviceIgor Konopko2019-05-061-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current lightnvm and pblk implementation does not care about NVMe max data transfer size, which can be smaller than 64*K=256K. There are existing NVMe controllers which NVMe max data transfer size is lower that 256K (for example 128K, which happens for existing NVMe controllers which are NVMe spec compliant). Such a controllers are not able to handle command which contains 64 PPAs, since the the size of DMAed buffer will be above the capabilities of such a controller. Signed-off-by: Igor Konopko <igor.j.konopko@intel.com> Reviewed-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Reviewed-by: Javier González <javier@javigon.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds2019-05-081-1/+1
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull SCSI updates from James Bottomley: "This is mostly update of the usual drivers: qla2xxx, qedf, smartpqi, hpsa, lpfc, ufs, mpt3sas, ibmvfc and hisi_sas. Plus number of minor changes, spelling fixes and other trivia" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (298 commits) scsi: qla2xxx: Avoid that lockdep complains about unsafe locking in tcm_qla2xxx_close_session() scsi: qla2xxx: Avoid that qlt_send_resp_ctio() corrupts memory scsi: qla2xxx: Fix hardirq-unsafe locking scsi: qla2xxx: Complain loudly about reference count underflow scsi: qla2xxx: Use __le64 instead of uint32_t[2] for sending DMA addresses to firmware scsi: qla2xxx: Introduce the dsd32 and dsd64 data structures scsi: qla2xxx: Check the size of firmware data structures at compile time scsi: qla2xxx: Pass little-endian values to the firmware scsi: qla2xxx: Fix race conditions in the code for aborting SCSI commands scsi: qla2xxx: Use an on-stack completion in qla24xx_control_vp() scsi: qla2xxx: Make qla24xx_async_abort_cmd() static scsi: qla2xxx: Remove unnecessary locking from the target code scsi: qla2xxx: Remove qla_tgt_cmd.released scsi: qla2xxx: Complain if a command is released that is owned by the firmware scsi: qla2xxx: target: Fix offline port handling and host reset handling scsi: qla2xxx: Fix abort handling in tcm_qla2xxx_write_pending() scsi: qla2xxx: Fix error handling in qlt_alloc_qfull_cmd() scsi: qla2xxx: Simplify qlt_send_term_imm_notif() scsi: qla2xxx: Fix use-after-free issues in qla2xxx_qpair_sp_free_dma() scsi: qla2xxx: Fix a qla24xx_enable_msix() error path ...
| * | scsi: scsi_transport_fc: nvme: display FC-NVMe port rolesHannes Reinecke2019-04-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the FC-NVMe driver is leverating the SCSI FC transport class to access the remote ports. Which means that all FC-NVMe remote ports will be visible to the fc transport layer, but due to missing definitions the port roles will always be 'unknown'. This patch adds the missing definitions to the fc transport class to that the port roles are correctly displayed. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: James Smart <james.smart@broadcom.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Giridhar Malavali <gmalavali@marvell.com> Reviewed-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | | Merge tag 'for-5.2/block-20190507' of git://git.kernel.dk/linux-blockLinus Torvalds2019-05-077-181/+208
|\ \ \ | | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block updates from Jens Axboe: "Nothing major in this series, just fixes and improvements all over the map. This contains: - Series of fixes for sed-opal (David, Jonas) - Fixes and performance tweaks for BFQ (via Paolo) - Set of fixes for bcache (via Coly) - Set of fixes for md (via Song) - Enabling multi-page for passthrough requests (Ming) - Queue release fix series (Ming) - Device notification improvements (Martin) - Propagate underlying device rotational status in loop (Holger) - Removal of mtip32xx trim support, which has been disabled for years (Christoph) - Improvement and cleanup of nvme command handling (Christoph) - Add block SPDX tags (Christoph) - Cleanup/hardening of bio/bvec iteration (Christoph) - A few NVMe pull requests (Christoph) - Removal of CONFIG_LBDAF (Christoph) - Various little fixes here and there" * tag 'for-5.2/block-20190507' of git://git.kernel.dk/linux-block: (164 commits) block: fix mismerge in bvec_advance block: don't drain in-progress dispatch in blk_cleanup_queue() blk-mq: move cancel of hctx->run_work into blk_mq_hw_sysfs_release blk-mq: always free hctx after request queue is freed blk-mq: split blk_mq_alloc_and_init_hctx into two parts blk-mq: free hw queue's resource in hctx's release handler blk-mq: move cancel of requeue_work into blk_mq_release blk-mq: grab .q_usage_counter when queuing request from plug code path block: fix function name in comment nvmet: protect discovery change log event list iteration nvme: mark nvme_core_init and nvme_core_exit static nvme: move command size checks to the core nvme-fabrics: check more command sizes nvme-pci: check more command sizes nvme-pci: remove an unneeded variable initialization nvme-pci: unquiesce admin queue on shutdown nvme-pci: shutdown on timeout during deletion nvme-pci: fix psdt field for single segment sgls nvme-multipath: don't print ANA group state by default nvme-multipath: split bios with the ns_head bio_set before submitting ...
| * | nvme: mark nvme_core_init and nvme_core_exit staticChristoph Hellwig2019-05-012-5/+2
| | | | | | | | | | | | | | | | | | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
| * | nvme: move command size checks to the coreChristoph Hellwig2019-05-012-28/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Most command aren't PCIe specific, so move the size checking for them to core.c Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
| * | nvme-fabrics: check more command sizesMinwoo Im2019-05-011-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | struct common_command provides a common structure for NVMe-oF command format. It also needs to be checked for unintended size growth. Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | nvme-pci: check more command sizesMinwoo Im2019-05-011-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All the NVMe command has 64bytes fixed size so that it has been assured with BUILD_BUG_ON(). The remaining command structures in linux/nvme.h also need to be checked here. Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | nvme-pci: remove an unneeded variable initializationMinwoo Im2019-05-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Variable "n" will be assigned once kstrtoint() succeeds, otherwise it will not be referred because kstrtoint() will return an error which means go out from this function. Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | nvme-pci: unquiesce admin queue on shutdownKeith Busch2019-05-011-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Just like IO queues, the admin queue also will not be restarted after a controller shutdown. Unquiesce this queue so that we do not block request dispatch on a permanently disabled controller. Reported-by: Yufen Yu <yuyufen@huawei.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | nvme-pci: shutdown on timeout during deletionKeith Busch2019-05-011-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We do not restart a controller in a deleting state for timeout errors. When in this state, unblock potential request dispatchers with failed completions by shutting down the controller on timeout detection. Reported-by: Yufen Yu <yuyufen@huawei.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | nvme-pci: fix psdt field for single segment sglsKlaus Birkelund Jensen2019-05-011-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | The shortcut for single segment SGL requests did not set the PSDT field to mark the request as using SGLs. Fixes: 297910571f08 ("nvme-pci: optimize mapping single segment requests using SGLs") Signed-off-by: Klaus Birkelund Jensen <klaus.jensen@cnexlabs.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | nvme-multipath: don't print ANA group state by defaultHannes Reinecke2019-05-011-1/+1
| | | | | | | | | | | | | | | | | | | | | Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | nvme-multipath: split bios with the ns_head bio_set before submittingHannes Reinecke2019-05-011-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the bio is moved to a different queue via blk_steal_bios() and the original queue is destroyed in nvme_remove_ns() we'll be ending with a crash in bio_endio() as the mempool for the split bio bvecs had already been destroyed. So split the bio using the original queue (which will remain during the lifetime of the bio) before sending it down to the underlying device. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | nvme-tcp: fix possible null deref on a timed out io queue connectSagi Grimberg2019-05-011-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | If I/O queue connect times out, we might have freed the queue socket already, so check for that on the error path in nvme_tcp_start_queue. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | nvme: set 0 capacity if namespace block size exceeds PAGE_SIZESagi Grimberg2019-04-251-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If our target exposed a namespace with a block size that is greater than PAGE_SIZE, set 0 capacity on the namespace as we do not support it. This issue encountered when the nvmet namespace was backed by a tempfile. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | nvme-tcp: rename function to have nvme_tcp prefixSagi Grimberg2019-04-251-6/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | usually nvme_ prefix is for core functions. While we're cleaning up, remove redundant empty lines Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Minwoo Im <minwoo.im@samsung.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | nvme-rdma: fix a NULL deref when an admin connect times outSagi Grimberg2019-04-251-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we timeout the admin startup sequence we might not yet have an I/O tagset allocated which causes the teardown sequence to crash. Make nvme_tcp_teardown_io_queues safe by not iterating inflight tags if the tagset wasn't allocated. Fixes: 4c174e636674 ("nvme-rdma: fix timeout handler") Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | nvme-tcp: fix a NULL deref when an admin connect times outSagi Grimberg2019-04-251-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we timeout the admin startup sequence we might not yet have an I/O tagset allocated which causes the teardown sequence to crash. Make nvme_tcp_teardown_io_queues safe by not iterating inflight tags if the tagset wasn't allocated. Fixes: 39d57757467b ("nvme-tcp: fix timeout handler") Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | Merge tag 'v5.1-rc6' into for-5.2/blockJens Axboe2019-04-222-6/+16
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull in v5.1-rc6 to resolve two conflicts. One is in BFQ, in just a comment, and is trivial. The other one is a conflict due to a later fix in the bio multi-page work, and needs a bit more care. * tag 'v5.1-rc6': (770 commits) Linux 5.1-rc6 block: make sure that bvec length can't be overflow block: kill all_q_node in request_queue x86/cpu/intel: Lower the "ENERGY_PERF_BIAS: Set to normal" message's log priority coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping mm/kmemleak.c: fix unused-function warning init: initialize jump labels before command line option parsing kernel/watchdog_hld.c: hard lockup message should end with a newline kcov: improve CONFIG_ARCH_HAS_KCOV help text mm: fix inactive list balancing between NUMA nodes and cgroups mm/hotplug: treat CMA pages as unmovable proc: fixup proc-pid-vm test proc: fix map_files test on F29 mm/vmstat.c: fix /proc/vmstat format for CONFIG_DEBUG_TLBFLUSH=y CONFIG_SMP=n mm/memory_hotplug: do not unlock after failing to take the device_hotplug_lock mm: swapoff: shmem_unuse() stop eviction without igrab() mm: swapoff: take notice of completion sooner mm: swapoff: remove too limiting SWAP_UNUSE_MAX_TRIES mm: swapoff: shmem_find_swap_entries() filter out other types slab: store tagged freelist for off-slab slabmgmt ... Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | nvme: log the error status on Identify Namespace failureKenneth Heitke2019-04-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Identify Namespace failures are logged as a warning but there is not an indication of the cause for the failure. Update the log message to include the error status. Signed-off-by: Kenneth Heitke <kenneth.heitke@intel.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | | nvme-pci: tidy up nvme_map_dataChristoph Hellwig2019-04-051-12/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove two pointless local variables, remove ret assignment that is never used, move the use_sgl initialization closer to where it is used. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
| * | | nvme-pci: optimize mapping single segment requests using SGLsChristoph Hellwig2019-04-051-0/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the controller supports SGLs we can take another short cut for single segment request, given that we can always map those without another indirection structure, and thus don't need to create a scatterlist structure. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
| * | | nvme-pci: optimize mapping of small single segment requestsChristoph Hellwig2019-04-051-5/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a request is single segment and fits into one or two PRP entries we do not have to create a scatterlist for it, but can just map the bio_vec directly. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
| * | | nvme-pci: remove the inline scatterlist optimizationChristoph Hellwig2019-04-051-32/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We'll have a better way to optimize for small I/O that doesn't require it soon, so remove the existing inline_sg case to make that optimization easier to implement. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>