summaryrefslogtreecommitdiffstats
path: root/drivers/nvme
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds2022-05-262-4/+4
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull rdma updates from Jason Gunthorpe: "Small collection of incremental improvement patches: - Minor code cleanup patches, comment improvements, etc from static tools - Clean the some of the kernel caps, reducing the historical stealth uAPI leftovers - Bug fixes and minor changes for rdmavt, hns, rxe, irdma - Remove unimplemented cruft from rxe - Reorganize UMR QP code in mlx5 to avoid going through the IB verbs layer - flush_workqueue(system_unbound_wq) removal - Ensure rxe waits for objects to be unused before allowing the core to free them - Several rc quality bug fixes for hfi1" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (67 commits) RDMA/rtrs-clt: Fix one kernel-doc comment RDMA/hfi1: Remove all traces of diagpkt support RDMA/hfi1: Consolidate software versions RDMA/hfi1: Remove pointless driver version RDMA/hfi1: Fix potential integer multiplication overflow errors RDMA/hfi1: Prevent panic when SDMA is disabled RDMA/hfi1: Prevent use of lock before it is initialized RDMA/rxe: Fix an error handling path in rxe_get_mcg() IB/core: Fix typo in comment RDMA/core: Fix typo in comment IB/hf1: Fix typo in comment IB/qib: Fix typo in comment IB/iser: Fix typo in comment RDMA/mlx4: Avoid flush_scheduled_work() usage IB/isert: Avoid flush_scheduled_work() usage RDMA/mlx5: Remove duplicate pointer assignment in mlx5_ib_alloc_implicit_mr() RDMA/qedr: Remove unnecessary synchronize_irq() before free_irq() RDMA/hns: Use hr_reg_read() instead of remaining roce_get_xxx() RDMA/hns: Use hr_reg_xxx() instead of remaining roce_set_xxx() RDMA/irdma: Add SW mechanism to generate completions on error ...
| * Merge tag 'v5.18' into rdma.git for-nextJason Gunthorpe2022-05-243-8/+33
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Following patches have dependencies. Resolve the merge conflict in drivers/net/ethernet/mellanox/mlx5/core/main.c by keeping the new names for the fs functions following linux-next: https://lore.kernel.org/r/20220519113529.226bc3e2@canb.auug.org.au/ Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | RDMA: Split kernel-only global device caps from uverbs device capsJason Gunthorpe2022-04-062-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Split out flags from ib_device::device_cap_flags that are only used internally to the kernel into kernel_cap_flags that is not part of the uapi. This limits the device_cap_flags to being the same bitmap that will be copied to userspace. This cleanly splits out the uverbs flags from the kernel flags to avoid confusion in the flags bitmap. Add some short comments describing which each of the kernel flags is connected to. Remove unused kernel flags. Link: https://lore.kernel.org/r/0-v2-22c19e565eef+139a-kern_caps_jgg@nvidia.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* | | Merge tag 'arm-drivers-5.19' of ↵Linus Torvalds2022-05-263-0/+1609
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM driver updates from Arnd Bergmann: "There are minor updates to SoC specific drivers for chips by Rockchip, Samsung, NVIDIA, TI, NXP, i.MX, Qualcomm, and Broadcom. Noteworthy driver changes include: - Several conversions of DT bindings to yaml format. - Renesas adds driver support for R-Car V4H, RZ/V2M and RZ/G2UL SoCs. - Qualcomm adds a bus driver for the SSC (Snapdragon Sensor Core), and support for more chips in the RPMh power domains and the soc-id. - NXP has a new driver for the HDMI blk-ctrl on i.MX8MP. - Apple M1 gains support for the on-chip NVMe controller, making it possible to finally use the internal disks. This also includes SoC drivers for their RTKit IPC and for the SART DMA address filter. For other subsystems that merge their drivers through the SoC tree, we have - Firmware drivers for the ARM firmware stack including TEE, OP-TEE, SCMI and FF-A get a number of smaller updates and cleanups. OP-TEE now has a cache for firmware argument structures as an optimization, and SCMI now supports the 3.1 version of the specification. - Reset controller updates to Amlogic, ASpeed, Renesas and ACPI drivers - Memory controller updates for Tegra, and a few updates for other platforms" * tag 'arm-drivers-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (159 commits) memory: tegra: Add MC error logging on Tegra186 onward memory: tegra: Add memory controller channels support memory: tegra: Add APE memory clients for Tegra234 memory: tegra: Add Tegra234 support nvme-apple: fix sparse endianess warnings soc/tegra: pmc: Document core domain fields soc: qcom: pdr: use static for servreg_* variables soc: imx: fix semicolon.cocci warnings soc: renesas: R-Car V3U is R-Car Gen4 soc: imx: add i.MX8MP HDMI blk-ctrl soc: imx: imx8m-blk-ctrl: Add i.MX8MP media blk-ctrl soc: imx: add i.MX8MP HSIO blk-ctrl soc: imx: imx8m-blk-ctrl: set power device name soc: qcom: llcc: Add sc8180x and sc8280xp configurations dt-bindings: arm: msm: Add sc8180x and sc8280xp LLCC compatibles soc/tegra: pmc: Select REGMAP dt-bindings: reset: st,sti-powerdown: Convert to yaml dt-bindings: reset: st,sti-picophyreset: Convert to yaml dt-bindings: reset: socfpga: Convert to yaml dt-bindings: reset: snps,axs10x-reset: Convert to yaml ...
| * | | nvme-apple: fix sparse endianess warningsArnd Bergmann2022-05-061-24/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The new nvme-apple driver is missing a few conversions to and from little-endian data: drivers/nvme/host/apple.c:291:19: sparse: sparse: incorrect type in assignment (different base types) @@ expected unsigned long long [usertype] prp1 @@ got restricted __le64 [usertype] prp1 @@ drivers/nvme/host/apple.c:291:19: sparse: expected unsigned long long [usertype] prp1 drivers/nvme/host/apple.c:291:19: sparse: got restricted __le64 [usertype] prp1 drivers/nvme/host/apple.c:292:19: sparse: sparse: incorrect type in assignment (different base types) @@ expected unsigned long long [usertype] prp2 @@ got restricted __le64 [usertype] prp2 @@ drivers/nvme/host/apple.c:293:21: sparse: sparse: incorrect type in assignment (different base types) @@ expected unsigned int [usertype] length @@ got restricted __le16 [usertype] length @@ drivers/nvme/host/apple.c:351:52: sparse: sparse: incorrect type in initializer (different base types) @@ expected unsigned int [usertype] next_dma_addr @@ got restricted __le64 [usertype] @@ drivers/nvme/host/apple.c:456:45: sparse: sparse: incorrect type in assignment (different base types) @@ expected restricted __le64 [usertype] @@ got unsigned int [addressable] [usertype] prp_dma @@ drivers/nvme/host/apple.c:459:31: sparse: sparse: incorrect type in assignment (different base types) @@ expected restricted __le64 [usertype] @@ got unsigned long long [assigned] [usertype] dma_addr @@ drivers/nvme/host/apple.c:474:25: sparse: sparse: incorrect type in assignment (different base types) @@ expected restricted __le64 [usertype] prp1 @@ got unsigned int [usertype] dma_address @@ drivers/nvme/host/apple.c:475:25: sparse: sparse: incorrect type in assignment (different base types) @@ expected restricted __le64 [usertype] prp2 @@ got unsigned int [usertype] first_dma @@ Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
| * | | Merge tag 'asahi-soc-rtkit-sart-nvme-for-5.19' of ↵Arnd Bergmann2022-05-053-0/+1609
| |\ \ \ | | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | https://github.com/AsahiLinux/linux into arm/drivers Apple SoC NVMe driver and dependencies: - RTKit IPC library required to boot and communicate with co-processors embedded inside Apple SoCs - SART DMA address filter required to allow some DMA transactions for the NVMe co-processor - NVMe platform driver The following minor changes since v3 on the mailing list have been folded in: - sart: %llx -> %pa for a phys_addr_t - rtkit:/sart: Drop IS_ENABLED inside headers - rtkit: Use EXPORT_SYMBOL_GPL instead of EXPORT_SYMBOL - nvme: Set NVME_REQ_CANCELLED in the timeout handler - nvme: Use DEFINE_SIMPLE_DEV_PM_OPS instead of #ifdef CONFIG_PM_SLEEP * tag 'asahi-soc-rtkit-sart-nvme-for-5.19' of https://github.com/AsahiLinux/linux: nvme-apple: Add initial Apple SoC NVMe driver dt-bindings: nvme: Add Apple ANS NVMe soc: apple: Add SART driver dt-bindings: iommu: Add Apple SART DMA address filter soc: apple: Add RTKit IPC library soc: apple: Always include Makefile Link: https://lore.kernel.org/r/20220505154020.84638-1-sven@svenpeter.dev Signed-off-by: Arnd Bergmann <arnd@arndb.de>
| | * | nvme-apple: Add initial Apple SoC NVMe driverSven Peter2022-05-023-0/+1609
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Apple SoCs such as the M1 come with an embedded NVMe controller that is not attached to any PCIe bus. Additionally, it doesn't conform to the NVMe specification and requires a bunch of changes to command submission and IOMMU configuration to work. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Sven Peter <sven@svenpeter.dev>
* | | Merge tag 'for-5.19/drivers-2022-05-22' of git://git.kernel.dk/linux-blockLinus Torvalds2022-05-237-24/+110
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block driver updates from Jens Axboe: "Here are the driver updates queued up for 5.19. This contains: - NVMe pull requests via Christoph: - tighten the PCI presence check (Stefan Roese) - fix a potential NULL pointer dereference in an error path (Kyle Miller Smith) - fix interpretation of the DMRSL field (Tom Yan) - relax the data transfer alignment (Keith Busch) - verbose error logging improvements (Max Gurtovoy, Chaitanya Kulkarni) - misc cleanups (Chaitanya Kulkarni, Christoph) - set non-mdts limits in nvme_scan_work (Chaitanya Kulkarni) - add support for TP4084 - Time-to-Ready Enhancements (Christoph) - MD pull request via Song: - Improve annotation in raid5 code, by Logan Gunthorpe - Support MD_BROKEN flag in raid-1/5/10, by Mariusz Tkaczyk - Other small fixes/cleanups - null_blk series making the configfs side much saner (Damien) - Various minor drbd cleanups and fixes (Haowen, Uladzislau, Jiapeng, Arnd, Cai) - Avoid using the system workqueue (and hence flushing it) in rnbd (Jack) - Avoid using the system workqueue (and hence flushing it) in aoe (Tetsuo) - Series fixing discard_alignment issues in drivers (Christoph) - Small series fixing drivers poking at disk->part0 for openers information (Christoph) - Series fixing deadlocks in loop (Christoph, Tetsuo) - Remove loop.h and add SPDX headers (Christoph) - Various fixes and cleanups (Julia, Xie, Yu)" * tag 'for-5.19/drivers-2022-05-22' of git://git.kernel.dk/linux-block: (72 commits) mtip32xx: fix typo in comment nvme: set non-mdts limits in nvme_scan_work nvme: add support for TP4084 - Time-to-Ready Enhancements nvme: split the enum used for various register constants nbd: Fix hung on disconnect request if socket is closed before nvme-fabrics: add a request timeout helper nvme-pci: harden drive presence detect in nvme_dev_disable() nvme-pci: fix a NULL pointer dereference in nvme_alloc_admin_tags nvme: mark internal passthru request RQF_QUIET nvme: remove unneeded include from constants file nvme: add missing status values to verbose logging nvme: set dma alignment to dword nvme: fix interpretation of DMRSL loop: remove most the top-of-file boilerplate comment from the UAPI header loop: remove most the top-of-file boilerplate comment loop: add a SPDX header loop: remove loop.h block: null_blk: Improve device creation with configfs block: null_blk: Cleanup messages block: null_blk: Cleanup device creation and deletion ...
| * | | nvme: set non-mdts limits in nvme_scan_workChaitanya Kulkarni2022-05-191-4/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In current implementation we set the non-mdts limits by calling nvme_init_non_mdts_limits() from nvme_init_ctrl_finish(). This also tries to set the limits for the discovery controller which has no I/O queues resulting in the warning message reported by the nvme_log_error() when running blktest nvme/002: - [ 2005.155946] run blktests nvme/002 at 2022-04-09 16:57:47 [ 2005.192223] loop: module loaded [ 2005.196429] nvmet: adding nsid 1 to subsystem blktests-subsystem-0 [ 2005.200334] nvmet: adding nsid 1 to subsystem blktests-subsystem-1 <------------------------------SNIP----------------------------------> [ 2008.958108] nvmet: adding nsid 1 to subsystem blktests-subsystem-997 [ 2008.962082] nvmet: adding nsid 1 to subsystem blktests-subsystem-998 [ 2008.966102] nvmet: adding nsid 1 to subsystem blktests-subsystem-999 [ 2008.973132] nvmet: creating discovery controller 1 for subsystem nqn.2014-08.org.nvmexpress.discovery for NQN testhostnqn. *[ 2008.973196] nvme1: Identify(0x6), Invalid Field in Command (sct 0x0 / sc 0x2) MORE DNR* [ 2008.974595] nvme nvme1: new ctrl: "nqn.2014-08.org.nvmexpress.discovery" [ 2009.103248] nvme nvme1: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery" Move the call of nvme_init_non_mdts_limits() to nvme_scan_work() after we verify that I/O queues are created since that is a converging point for each transport where these limits are actually used. 1. FC : nvme_fc_create_association() ... nvme_fc_create_io_queues(ctrl); ... nvme_start_ctrl() nvme_scan_queue() nvme_scan_work() 2. PCIe:- nvme_reset_work() ... nvme_setup_io_queues() nvme_create_io_queues() nvme_alloc_queue() ... nvme_start_ctrl() nvme_scan_queue() nvme_scan_work() 3. RDMA :- nvme_rdma_setup_ctrl ... nvme_rdma_configure_io_queues ... nvme_start_ctrl() nvme_scan_queue() nvme_scan_work() 4. TCP :- nvme_tcp_setup_ctrl ... nvme_tcp_configure_io_queues ... nvme_start_ctrl() nvme_scan_queue() nvme_scan_work() * nvme_scan_work() ... nvme_validate_or_alloc_ns() nvme_alloc_ns() nvme_update_ns_info() nvme_update_disk_info() nvme_config_discard() <--- blk_queue_max_write_zeroes_sectors() <--- Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | | nvme: add support for TP4084 - Time-to-Ready EnhancementsChristoph Hellwig2022-05-182-6/+71
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for using longer timeouts during controller initialization and letting the controller come up with namespaces that are not ready for I/O yet. We skip these not ready namespaces during scanning and only bring them online once anoter scan is kicked off by the AEN that is set when the NRDY bit gets set in the I/O Command Set Independent Identify Namespace Data Structure. This asynchronous probing avoids blocking the kernel boot when controllers take a very long time to recover after unclean shutdowns (up to minutes). Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Hannes Reinecke <hare@suse.de>
| * | | nvme-fabrics: add a request timeout helperChaitanya Kulkarni2022-05-163-8/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The RDAMA and TCP transport both complete the timed out request in the same manner and hence code is duplicated. Add and use the helper nvmf_complete_timed_out_request() to remove the duplicate code. Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | | nvme-pci: harden drive presence detect in nvme_dev_disable()Stefan Roese2022-05-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On our ZynqMP system we observe, that a NVMe drive that resets itself while doing a firmware update causes a Kernel crash like this: [ 67.720772] pcieport 0000:02:02.0: pciehp: Slot(2): Link Down [ 67.720783] pcieport 0000:02:02.0: pciehp: Slot(2): Card not present [ 67.720795] nvme 0000:04:00.0: PME# disabled [ 67.720849] Internal error: synchronous external abort: 96000010 [#1] PREEMPT SMP [ 67.720853] nwl-pcie fd0e0000.pcie: Slave error Analysis: When nvme_dev_disable() is called because of this PCIe hotplug event, pci_is_enabled() is still true. And accessing the NVMe drive which is currently not available as it's in reboot process causes this "synchronous external abort" on this ARM64 platform. This patch adds the pci_device_is_present() check as well, which returns false in this "Card not present" hot-plug case. With this change, the NVMe driver does not try to access the NVMe registers any more and the FW update finishes without any problems. Signed-off-by: Stefan Roese <sr@denx.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | | nvme-pci: fix a NULL pointer dereference in nvme_alloc_admin_tagsSmith, Kyle Miller (Nimble Kernel)2022-05-161-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In nvme_alloc_admin_tags, the admin_q can be set to an error (typically -ENOMEM) if the blk_mq_init_queue call fails to set up the queue, which is checked immediately after the call. However, when we return the error message up the stack, to nvme_reset_work the error takes us to nvme_remove_dead_ctrl() nvme_dev_disable() nvme_suspend_queue(&dev->queues[0]). Here, we only check that the admin_q is non-NULL, rather than not an error or NULL, and begin quiescing a queue that never existed, leading to bad / NULL pointer dereference. Signed-off-by: Kyle Smith <kyles@hpe.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | | nvme: mark internal passthru request RQF_QUIETChaitanya Kulkarni2022-05-162-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Most of the internal passthru commands use __nvme_submit_sync_cmd() interface. There are few places we open code the request submission :- 1. nvme_keep_alive_work(struct work_struct *work) 2. nvme_timeout(struct request *req, bool reserved) 3. nvme_delete_queue(struct nvme_queue *nvmeq, u8 opcode) Mark the internal passthru request quiet so that we can skip the verbose error message from nvme_log_error() in nvme_end_req() completion path, this will be consistent with what we have in __nvme_submit_sync_cmd(). Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Alan Adamson <alan.adamson@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | | nvme: remove unneeded include from constants fileMax Gurtovoy2022-05-161-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | No usage of blkdev.h elements. Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | | nvme: add missing status values to verbose loggingMax Gurtovoy2022-05-161-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Log a few more path related status codes. Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | | nvme: set dma alignment to dwordKeith Busch2022-05-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The nvme specification only requires qword alignment for segment descriptors, and the driver already guarantees that. The spec has always allowed user data to be dword aligned, which is what the queue's attribute is for, so relax the alignment requirement to that value. While we could allow byte alignment for some controllers when using SGLs, we still need to support PRP, and that only allows dword. Fixes: 3b2a1ebceba3 ("nvme: set dma alignment to qword") Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | | nvme: fix interpretation of DMRSLTom Yan2022-05-162-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | DMRSLl is in the unit of logical blocks, while max_discard_sectors is in the unit of "linux sector". Signed-off-by: Tom Yan <tom.ty89@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | | nvme: remove a spurious clear of discard_alignmentChristoph Hellwig2022-05-031-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The nvme driver never sets a discard_alignment, so it also doens't need to clear it to zero. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20220418045314.360785-10-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | | | Merge tag 'for-5.19/block-2022-05-22' of git://git.kernel.dk/linux-blockLinus Torvalds2022-05-234-22/+13
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block updates from Jens Axboe: "Here are the core block changes for 5.19. This contains: - blk-throttle accounting fix (Laibin) - Series removing redundant assignments (Michal) - Expose bio cache via the bio_set, so that DM can use it (Mike) - Finish off the bio allocation interface cleanups by dealing with the weirdest member of the family. bio_kmalloc combines a kmalloc for the bio and bio_vecs with a hidden bio_init call and magic cleanup semantics (Christoph) - Clean up the block layer API so that APIs consumed by file systems are (almost) only struct block_device based, so that file systems don't have to poke into block layer internals like the request_queue (Christoph) - Clean up the blk_execute_rq* API (Christoph) - Clean up various lose end in the blk-cgroup code to make it easier to follow in preparation of reworking the blkcg assignment for bios (Christoph) - Fix use-after-free issues in BFQ when processes with merged queues get moved to different cgroups (Jan) - BFQ fixes (Jan) - Various fixes and cleanups (Bart, Chengming, Fanjun, Julia, Ming, Wolfgang, me)" * tag 'for-5.19/block-2022-05-22' of git://git.kernel.dk/linux-block: (83 commits) blk-mq: fix typo in comment bfq: Remove bfq_requeue_request_body() bfq: Remove superfluous conversion from RQ_BIC() bfq: Allow current waker to defend against a tentative one bfq: Relax waker detection for shared queues blk-cgroup: delete rcu_read_lock_held() WARN_ON_ONCE() blk-throttle: Set BIO_THROTTLED when bio has been throttled blk-cgroup: Remove unnecessary rcu_read_lock/unlock() blk-cgroup: always terminate io.stat lines block, bfq: make bfq_has_work() more accurate block, bfq: protect 'bfqd->queued' by 'bfqd->lock' block: cleanup the VM accounting in submit_bio block: Fix the bio.bi_opf comment block: reorder the REQ_ flags blk-iocost: combine local_stat and desc_stat to stat block: improve the error message from bio_check_eod block: allow passing a NULL bdev to bio_alloc_clone/bio_init_clone block: remove superfluous calls to blkcg_bio_issue_init kthread: unexport kthread_blkcg blk-cgroup: cleanup blkcg_maybe_throttle_current ...
| * | | | nvme-fc: fold t fc_update_appid into fc_appid_storeChristoph Hellwig2022-05-021-16/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | No need for this wrapper. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20220420042723.1010598-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | | nvme-fc: don't support the appid attribute without CONFIG_BLK_CGROUP_FC_APPIDChristoph Hellwig2022-05-021-1/+6
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | nvme-fc appid support needs CONFIG_BLK_CGROUP_FC_APPID to work, so disable the whole code if the option is not set. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20220420042723.1010598-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | block: decouple REQ_OP_SECURE_ERASE from REQ_OP_DISCARDChristoph Hellwig2022-04-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Secure erase is a very different operation from discard in that it is a data integrity operation vs hint. Fully split the limits and helper infrastructure to make the separation more clear. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd] Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> [nifs2] Acked-by: Jaegeuk Kim <jaegeuk@kernel.org> [f2fs] Acked-by: Coly Li <colyli@suse.de> [bcache] Acked-by: David Sterba <dsterba@suse.com> [btrfs] Acked-by: Chao Yu <chao@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220415045258.199825-27-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | block: remove QUEUE_FLAG_DISCARDChristoph Hellwig2022-04-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Just use a non-zero max_discard_sectors as an indicator for discard support, similar to what is done for write zeroes. The only places where needs special attention is the RAID5 driver, which must clear discard support for security reasons by default, even if the default stacking rules would allow for it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd] Acked-by: Jan Höppner <hoeppner@linux.ibm.com> [s390] Acked-by: Coly Li <colyli@suse.de> [bcache] Acked-by: David Sterba <dsterba@suse.com> [btrfs] Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220415045258.199825-25-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | block: add a bdev_max_zone_append_sectors helperChristoph Hellwig2022-04-171-2/+1
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a helper to check the max supported sectors for zone append based on the block_device instead of having to poke into the block layer internal request_queue. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220415045258.199825-16-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | | nvme: enable uring-passthrough for admin commandsKanchan Joshi2022-05-203-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add two new opcodes that userspace can use for admin commands: NVME_URING_CMD_ADMIN : non-vectroed NVME_URING_CMD_ADMIN_VEC : vectored variant Wire up support when these are issued on controller node(/dev/nvmeX). Signed-off-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220520090630.70394-3-joshi.k@samsung.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | | nvme: helper for uring-passthrough checksKanchan Joshi2022-05-201-8/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Factor out a helper consolidating the error checks, and fix typo in a comment too. This is in preparation to support admin commands on this path. Signed-off-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220520090630.70394-2-joshi.k@samsung.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | | nvme: add vectored-io support for uring-cmdAnuj Gupta2022-05-111-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | wire up support for async passthru that takes an array of buffers (using iovec). Exposed via a new op NVME_URING_CMD_IO_VEC. Same 'struct nvme_uring_cmd' is to be used with - 1. cmd.addr as base address of user iovec array 2. cmd.data_len as count of iovec array elements Signed-off-by: Kanchan Joshi <joshi.k@samsung.com> Signed-off-by: Anuj Gupta <anuj20.g@samsung.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220511054750.20432-6-joshi.k@samsung.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | | nvme: wire-up uring-cmd support for io-passthru on char-device.Kanchan Joshi2022-05-114-3/+195
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce handler for fops->uring_cmd(), implementing async passthru on char device (/dev/ngX). The handler supports newly introduced operation NVME_URING_CMD_IO. This operates on a new structure nvme_uring_cmd, which is similar to struct nvme_passthru_cmd64 but without the embedded 8b result field. This field is not needed since uring-cmd allows to return additional result via big-CQE. Signed-off-by: Kanchan Joshi <joshi.k@samsung.com> Signed-off-by: Anuj Gupta <anuj20.g@samsung.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220511054750.20432-5-joshi.k@samsung.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | | nvme: refactor nvme_submit_user_cmd()Christoph Hellwig2022-05-111-11/+45
|/ / | | | | | | | | | | | | | | | | | | | | Divide the work into two helpers, namely nvme_alloc_user_request and nvme_execute_user_rq. This is a prep patch, to help wiring up uring-cmd support in nvme. Signed-off-by: Christoph Hellwig <hch@lst.de> [axboe: fold in fix for assuming bio is non-NULL] Link: https://lore.kernel.org/r/20220511054750.20432-4-joshi.k@samsung.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | nvme-pci: disable namespace identifiers for Qemu controllersChristoph Hellwig2022-04-151-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Qemu unconditionally reports a UUID, which depending on the qemu version is either all-null (which is incorrect but harmless) or contains a single bit set for all controllers. In addition it can also optionally report a eui64 which needs to be manually set. Disable namespace identifiers for Qemu controlles entirely even if in some cases they could be set correctly through manual intervention. Reported-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
* | nvme-pci: disable namespace identifiers for the MAXIO MAP1002/1202Christoph Hellwig2022-04-151-0/+4
| | | | | | | | | | | | | | | | | | | | | | The MAXIO MAP1002/1202 controllers reports completely bogus Namespace identifiers that even change after suspend cycles. Disable using the Identifiers entirely. Reported-by: 金韬 <me@kingtous.cn> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Tested-by: 金韬 <me@kingtous.cn>
* | nvme: add a quirk to disable namespace identifiersChristoph Hellwig2022-04-152-6/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a quirk to disable using and exporting namespace identifiers for controllers where they are broken beyond repair. The most directly visible problem with non-unique namespace identifiers is that they break the /dev/disk/by-id/ links, with the link for a supposedly unique identifier now pointing to one of multiple possible namespaces that share the same ID, and a somewhat random selection of which one actually shows up. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
* | nvme: don't print verbose errors for internal passthrough requestsChaitanya Kulkarni2022-04-151-1/+2
|/ | | | | | | | | | | | | | | | Use the RQF_QUIET flag to skip the newly added verbose error reporting, and set the flag in __nvme_submit_sync_cmd, which is used for most internal passthrough requests where we do expect errors (e.g. due to probing for optional functionality). This is similar to what the SCSI verbose error logging does. Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Alan Adamson <alan.adamson@oracle.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Tested-by: Alan Adamson <alan.adamson@oracle.com> Tested-by: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
* Merge tag 'for-5.18/drivers-2022-04-01' of git://git.kernel.dk/linux-blockLinus Torvalds2022-04-0115-57/+132
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block driver fixes from Jens Axboe: "Followup block driver updates and fixes for the 5.18-rc1 merge window. In detail: - NVMe pull request - Fix multipath hang when disk goes live over reconnect (Anton Eidelman) - fix RCU hole that allowed for endless looping in multipath round robin (Chris Leech) - remove redundant assignment after left shift (Colin Ian King) - add quirks for Samsung X5 SSDs (Monish Kumar R) - fix the read-only state for zoned namespaces with unsupposed features (Pankaj Raghav) - use a private workqueue instead of the system workqueue in nvmet (Sagi Grimberg) - allow duplicate NSIDs for private namespaces (Sungup Moon) - expose use_threaded_interrupts read-only in sysfs (Xin Hao)" - nbd minor allocation fix (Zhang) - drbd fixes and maintainer addition (Lars, Jakob, Christoph) - n64cart build fix (Jackie) - loop compat ioctl fix (Carlos) - misc fixes (Colin, Dongli)" * tag 'for-5.18/drivers-2022-04-01' of git://git.kernel.dk/linux-block: drbd: remove check of list iterator against head past the loop body drbd: remove usage of list iterator variable after loop nbd: fix possible overflow on 'first_minor' in nbd_dev_add() MAINTAINERS: add drbd co-maintainer drbd: fix potential silent data corruption loop: fix ioctl calls using compat_loop_info nvme-multipath: fix hang when disk goes live over reconnect nvme: fix RCU hole that allowed for endless looping in multipath round robin nvme: allow duplicate NSIDs for private namespaces nvmet: remove redundant assignment after left shift nvmet: use a private workqueue instead of the system workqueue nvme-pci: add quirks for Samsung X5 SSDs nvme-pci: expose use_threaded_interrupts read-only in sysfs nvme: fix the read-only state for zoned namespaces with unsupposed features n64cart: convert bi_disk to bi_bdev->bd_disk fix build xen/blkfront: fix comment for need_copy xen-blkback: remove redundant assignment to variable i
| * nvme-multipath: fix hang when disk goes live over reconnectAnton Eidelman2022-03-293-2/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | nvme_mpath_init_identify() invoked from nvme_init_identify() fetches a fresh ANA log from the ctrl. This is essential to have an up to date path states for both existing namespaces and for those scan_work may discover once the ctrl is up. This happens in the following cases: 1) A new ctrl is being connected. 2) An existing ctrl is successfully reconnected. 3) An existing ctrl is being reset. While in (1) ctrl->namespaces is empty, (2 & 3) may have namespaces, and nvme_read_ana_log() may call nvme_update_ns_ana_state(). This result in a hang when the ANA state of an existing namespace changes and makes the disk live: nvme_mpath_set_live() issues IO to the namespace through the ctrl, which does NOT have IO queues yet. See sample hang below. Solution: - nvme_update_ns_ana_state() to call set_live only if ctrl is live - nvme_read_ana_log() call from nvme_mpath_init_identify() therefore only fetches and parses the ANA log; any erros in this process will fail the ctrl setup as appropriate; - a separate function nvme_mpath_update() is called in nvme_start_ctrl(); this parses the ANA log without fetching it. At this point the ctrl is live, therefore, disks can be set live normally. Sample failure: nvme nvme0: starting error recovery nvme nvme0: Reconnecting in 10 seconds... block nvme0n6: no usable path - requeuing I/O INFO: task kworker/u8:3:312 blocked for more than 122 seconds. Tainted: G E 5.14.5-1.el7.elrepo.x86_64 #1 Workqueue: nvme-wq nvme_tcp_reconnect_ctrl_work [nvme_tcp] Call Trace: __schedule+0x2a2/0x7e0 schedule+0x4e/0xb0 io_schedule+0x16/0x40 wait_on_page_bit_common+0x15c/0x3e0 do_read_cache_page+0x1e0/0x410 read_cache_page+0x12/0x20 read_part_sector+0x46/0x100 read_lba+0x121/0x240 efi_partition+0x1d2/0x6a0 bdev_disk_changed.part.0+0x1df/0x430 bdev_disk_changed+0x18/0x20 blkdev_get_whole+0x77/0xe0 blkdev_get_by_dev+0xd2/0x3a0 __device_add_disk+0x1ed/0x310 device_add_disk+0x13/0x20 nvme_mpath_set_live+0x138/0x1b0 [nvme_core] nvme_update_ns_ana_state+0x2b/0x30 [nvme_core] nvme_update_ana_state+0xca/0xe0 [nvme_core] nvme_parse_ana_log+0xac/0x170 [nvme_core] nvme_read_ana_log+0x7d/0xe0 [nvme_core] nvme_mpath_init_identify+0x105/0x150 [nvme_core] nvme_init_identify+0x2df/0x4d0 [nvme_core] nvme_init_ctrl_finish+0x8d/0x3b0 [nvme_core] nvme_tcp_setup_ctrl+0x337/0x390 [nvme_tcp] nvme_tcp_reconnect_ctrl_work+0x24/0x40 [nvme_tcp] process_one_work+0x1bd/0x360 worker_thread+0x50/0x3d0 Signed-off-by: Anton Eidelman <anton@lightbitslabs.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * nvme: fix RCU hole that allowed for endless looping in multipath round robinChris Leech2022-03-291-4/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make nvme_ns_remove match the assumptions elsewhere. 1) !NVME_NS_READY needs to be srcu synchronized to make sure nothing is running in __nvme_find_path or nvme_round_robin_path that will re-assign this ns to current_path. 2) Any matching current_path entries need to be cleared before removing from the siblings list, to prevent calling nvme_round_robin_path with an "old" ns that's off list. 3) Finally the list_del_rcu can happen, and then synchronize again before releasing any reference counts. Signed-off-by: Christoph Hellwig <hch@lst.de>
| * nvme: allow duplicate NSIDs for private namespacesSungup Moon2022-03-293-8/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A NVMe subsystem with multiple controller can have private namespaces that use the same NSID under some conditions: "If Namespace Management, ANA Reporting, or NVM Sets are supported, the NSIDs shall be unique within the NVM subsystem. If the Namespace Management, ANA Reporting, and NVM Sets are not supported, then NSIDs: a) for shared namespace shall be unique; and b) for private namespace are not required to be unique." Reference: Section 6.1.6 NSID and Namespace Usage; NVM Express 1.4c spec. Make sure this specific setup is supported in Linux. Fixes: 9ad1927a3bc2 ("nvme: always search for namespace head") Signed-off-by: Sungup Moon <sungup.moon@samsung.com> [hch: refactored and fixed the controller vs subsystem based naming conflict] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
| * nvmet: remove redundant assignment after left shiftColin Ian King2022-03-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The left shift is followed by a re-assignment back to cc_css, the assignment is redundant. Fix this by replacing the "<<=" operator with "<<" instead. This cleans up the clang scan build warning: drivers/nvme/target/core.c:1124:10: warning: Although the value stored to 'cc_css' is used in the enclosing expression, the value is never actually read from 'cc_css' [deadcode.DeadStores] Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * nvmet: use a private workqueue instead of the system workqueueSagi Grimberg2022-03-2911-37/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Any attempt to flush kernel-global WQs has possibility of deadlock so we should simply stop using them, instead introduce nvmet_wq which is the generic nvmet workqueue for work elements that don't explicitly require a dedicated workqueue (by the mere fact that they are using the system_wq). Changes were done using the following replaces: - s/schedule_work(/queue_work(nvmet_wq, /g - s/schedule_delayed_work(/queue_delayed_work(nvmet_wq, /g - s/flush_scheduled_work()/flush_workqueue(nvmet_wq)/g Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * nvme-pci: add quirks for Samsung X5 SSDsMonish Kumar R2022-03-231-1/+4
| | | | | | | | | | | | | | | | Add quirks to not fail the initialization and to have quick resume latency after cold/warm reboot. Signed-off-by: Monish Kumar R <monish.kumar.r@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * nvme-pci: expose use_threaded_interrupts read-only in sysfsXin Hao2022-03-231-1/+1
| | | | | | | | | | | | | | | | Allow reading /sys/module/nvme/parameters/use_threaded_interrupts to see if the use_threaded_interrupts module parameter is in use. Signed-off-by: Xin Hao <xhao@linux.alibaba.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * nvme: fix the read-only state for zoned namespaces with unsupposed featuresPankaj Raghav2022-03-231-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 2f4c9ba23b88 ("nvme: export zoned namespaces without Zone Append support read-only") marks zoned namespaces without append support read-only. It does iso by setting NVME_NS_FORCE_RO in ns->flags in nvme_update_zone_info and checking for that flag later in nvme_update_disk_info to mark the disk as read-only. But commit 73d90386b559 ("nvme: cleanup zone information initialization") rearranged nvme_update_disk_info to be called before nvme_update_zone_info and thus not marking the disk as read-only. The call order cannot be just reverted because nvme_update_zone_info sets certain queue parameters such as zone_write_granularity that depend on the prior call to nvme_update_disk_info. Remove the call to set_disk_ro in nvme_update_disk_info. and call set_disk_ro after nvme_update_zone_info and nvme_update_disk_info to set the permission for ZNS drives correctly. The same applies to the multipath disk path. Fixes: 73d90386b559 ("nvme: cleanup zone information initialization") Signed-off-by: Pankaj Raghav <p.raghav@samsung.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
* | Merge tag 'for-5.18/64bit-pi-2022-03-25' of git://git.kernel.dk/linux-blockLinus Torvalds2022-03-262-28/+141
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block layer 64-bit data integrity support from Jens Axboe: "This adds support for 64-bit data integrity in the block layer and in NVMe" * tag 'for-5.18/64bit-pi-2022-03-25' of git://git.kernel.dk/linux-block: crypto: fix crc64 testmgr digest byte order nvme: add support for enhanced metadata block: add pi for extended integrity crypto: add rocksoft 64b crc guard tag framework lib: add rocksoft model crc64 linux/kernel: introduce lower_48_bits function asm-generic: introduce be48 unaligned accessors nvme: allow integrity on extended metadata formats block: support pi with extended metadata
| * | nvme: add support for enhanced metadataKeith Busch2022-03-072-25/+141
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | NVM Express ratified TP 4068 defines new protection information formats. Implement support for the CRC64 guard tags. Since the block layer doesn't support variable length reference tags, driver support for the Storage Tag space is not supported at this time. Cc: Hannes Reinecke <hare@suse.de> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Klaus Jensen <its@irrelevant.dk> Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20220303201312.3255347-9-kbusch@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | nvme: allow integrity on extended metadata formatsKeith Busch2022-03-071-5/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The block integrity subsystem knows how to construct protection information buffers with metadata beyond the protection information fields. Remove the driver restriction. Note, this can only work if the PI field appears first in the metadata, as the integrity subsystem doesn't calculate guard tags on preceding metadata. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20220303201312.3255347-3-kbusch@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | | Merge tag 'for-5.18/write-streams-2022-03-18' of git://git.kernel.dk/linux-blockLinus Torvalds2022-03-262-144/+0
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NVMe write streams removal from Jens Axboe: "This removes the write streams support in NVMe. No vendor ever really shipped working support for this, and they are not interested in supporting it. With the NVMe support gone, we have nothing in the tree that supports this. Remove passing around of the hints. The only discussion point in this patchset imho is the fact that the file specific write hint setting/getting fcntl helpers will now return -1/EINVAL like they did before we supported write hints. No known applications use these functions, I only know of one prototype that I help do for RocksDB, and that's not used. That said, with a change like this, it's always a bit controversial. Alternatively, we could just make them return 0 and pretend it worked. It's placement based hints after all" * tag 'for-5.18/write-streams-2022-03-18' of git://git.kernel.dk/linux-block: fs: remove fs.f_write_hint fs: remove kiocb.ki_hint block: remove the per-bio/request write hint nvme: remove support or stream based temperature hint
| * | nvme: remove support or stream based temperature hintChristoph Hellwig2022-03-072-144/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This support was added for RocksDB, but RocksDB ended up not using it. At the same time drives on the open marked (vs those build for OEMs for non-Linux support) that actually support streams are extremly rare. Don't bloat the nvme driver for it. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20220304175556.407719-1-hch@lst.de [axboe: fold in ctrl->nr_streams removal from Keith] Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | Merge branch 'for-5.18/drivers' into for-5.18/write-streamsJens Axboe2022-03-0718-152/+516
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * for-5.18/drivers: (51 commits) bcache: fixup multiple threads crash bcache: fixup bcache_dev_sectors_dirty_add() multithreaded CPU false sharing floppy: use memcpy_{to,from}_bvec drbd: use bvec_kmap_local in recv_dless_read drbd: use bvec_kmap_local in drbd_csum_bio bcache: use bvec_kmap_local in bio_csum nvdimm-btt: use bvec_kmap_local in btt_rw_integrity nvdimm-blk: use bvec_kmap_local in nd_blk_rw_integrity zram: use memcpy_from_bvec in zram_bvec_write zram: use memcpy_to_bvec in zram_bvec_read aoe: use bvec_kmap_local in bvcpy iss-simdisk: use bvec_kmap_local in simdisk_submit_bio nvme: check that EUI/GUID/UUID are globally unique nvme: check for duplicate identifiers earlier nvme: fix the check for duplicate unique identifiers nvme: cleanup __nvme_check_ids nvme: remove nssa from struct nvme_ctrl nvme: explicitly set non-error for directives nvme: expose cntrltype and dctype through sysfs nvme: send uevent on connection up ...
| * \ \ Merge branch 'for-5.18/block' into for-5.18/write-streamsJens Axboe2022-03-073-20/+19
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * for-5.18/block: (96 commits) block: remove bio_devname ext4: stop using bio_devname raid5-ppl: stop using bio_devname raid1: stop using bio_devname md-multipath: stop using bio_devname dm-integrity: stop using bio_devname dm-crypt: stop using bio_devname pktcdvd: remove a pointless debug check in pkt_submit_bio block: remove handle_bad_sector block: fix and cleanup bio_check_ro bfq: fix use-after-free in bfq_dispatch_request blk-crypto: show crypto capabilities in sysfs block: don't delete queue kobject before its children block: simplify calling convention of elv_unregister_queue() block: remove redundant semicolon block: default BLOCK_LEGACY_AUTOLOAD to y block: update io_ticks when io hang block, bfq: don't move oom_bfqq block, bfq: avoid moving bfqq to it's parent bfqg block, bfq: cleanup bfq_bfqq_to_bfqg() ...