summaryrefslogtreecommitdiffstats
path: root/drivers
Commit message (Collapse)AuthorAgeFilesLines
* nvmet: provide aen bit functions for multiple controller typesJay Sternberg2018-12-073-20/+15
| | | | | | | | | | | | Move nvmet_aen_disabled and nvmet_clear_aen in preparation for other types of controllers to use, initially the discovery controller. Signed-off-by: Jay Sternberg <jay.e.sternberg@intel.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* nvmet: use IOCB_NOWAIT for file-ns buffered I/OChaitanya Kulkarni2018-12-071-46/+84
| | | | | | | | | | | | | This patch optimizes read command behavior when file-ns configured with buffered I/O. Instead of offloading the buffered I/O read operations to the worker threads, we first issue the read operation with IOCB_NOWAIT and try and access the data from the cache. Here we only offload the request to the worker thread and complete the request in the worker thread context when IOCB_NOWAIT request fails. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* nvmet: support for traffic based keep-aliveSagi Grimberg2018-12-073-1/+16
| | | | | | | | | | | | | A controller that supports traffic based keep-alive can restart the keep alive timer even when no keep-alive was not received in the kato period as long as other admin or I/O commands were received. For each command set ctrl->cmd_seen to true, and when keep-alive timer expires, if any commands were seen, resched ka_work instead of escalating to a fatal error. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* nvme: support traffic based keep-aliveSagi Grimberg2018-12-072-0/+14
| | | | | | | | | | | | | | | If the controller supports traffic based keep alive, we restart the keep alive timer if any admin or io commands was completed during the kato period. This prevents a possible starvation of keep alive commands in the presence of heavy traffic as in such case, we already have a health indication from the host perspective. Only set a comp_seen indicator in case the controller supports keep alive to minimize the overhead for pci controllers. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* nvme: cache controller attributesSagi Grimberg2018-12-072-0/+2
| | | | | | | | | | We get the controller attributes in identify, cache them as we'll need them for traffic based keep alive support. Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* nvme: introduce ctrl attributes enumerationSagi Grimberg2018-12-071-1/+1
| | | | | | | | | | | We are growing more controller attributes, so use a proper enumeration for it. For now just add the 128-bit hostid which we support. Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* nvme: add a numa_node field to struct nvme_ctrlHannes Reinecke2018-12-075-7/+12
| | | | | | | | | | | | | | Instead of directly poking into the struct device add a new numa_node field to struct nvme_ctrl. This allows fabrics drivers where ctrl->dev is a virtual device to support NUMA affinity as well. Also expose the field as a sysfs attribute, and populate it for the RDMA and FC transports. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* nvme: consolidate memset calls in the nvme_setup_cmd pathChaitanya Kulkarni2018-12-071-3/+1
| | | | | | | | | | | | | | | In function nvme_setup_cmd() we call command specific setup function for flush, rw, and discard. Instead of calling memset in each function lets call it once in the parent function. This is purely code cleanup patch and it does not change any existing functionality. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* blkcg: remove bio->bi_css and instead use bio->bi_blkgDennis Zhou2018-12-072-3/+4
| | | | | | | | | | | | | | Prior patches ensured that any bio that interacts with a request_queue is properly associated with a blkg. This makes bio->bi_css unnecessary as blkg maintains a reference to blkcg already. This removes the bio field bi_css and transfers corresponding uses to access via bi_blkg. Signed-off-by: Dennis Zhou <dennis@kernel.org> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* dm: set the static flush bio device on demandDennis Zhou2018-12-071-1/+11
| | | | | | | | | | | | | | | | The next patch changes the macro bio_set_dev() to associate a bio with a blkg based on the device set. However, dm creates a static bio to be used as the basis for cloning empty flush bios on creation. The bio_set_dev() call in alloc_dev() will cause problems with the next patch adding association to bio_set_dev() because the call is before the bdev is associated with a gendisk (bd_disk is %NULL). To get around this, set the device on the static bio every time and use that to clone to the other bios. Signed-off-by: Dennis Zhou <dennis@kernel.org> Acked-by: Mike Snitzer <snitzer@redhat.com> Cc: Alasdair Kergon <agk@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* block: only allow polling if a poll queue_map existsChristoph Hellwig2018-12-041-20/+9
| | | | | | | | | This avoids having to have differnet mq_ops for different setups with or without poll queues. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* nvme-mpath: remove I/O polling supportChristoph Hellwig2018-12-041-16/+0
| | | | | | | | | | | | | | The ->poll_fn has been stale for a while, as a lot of places check for mq ops. But there is no real point in it anyway, as we don't even use the multipath code for subsystems without multiple ports, which is usually what we do high performance I/O to. If it really becomes an issue we should rework the nvme code to also skip the multipath code for any private namespace, even if that could mean some trouble when rescanning. Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* nvme-rdma: remove I/O polling supportChristoph Hellwig2018-12-041-24/+0
| | | | | | | | | The code was always a bit of a hack that digs far too much into RDMA core internals. Lets kick it out and reimplement proper dedicated poll queues as needed. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* nvme-pci: remove the CQ lock for interrupt driven queuesChristoph Hellwig2018-12-041-11/+26
| | | | | | | | | | | | | | | Now that we can't poll regular, interrupt driven I/O queues there is almost nothing that can race with an interrupt. The only possible other contexts polling a CQ are the error handler and queue shutdown, and both are so far off in the slow path that we can simply use the big hammer of disabling interrupts. With that we can stop taking the cq_lock for normal queues. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* nvme-pci: don't poll from irq context when deleting queuesChristoph Hellwig2018-12-041-8/+19
| | | | | | | | | | | | This is the last place outside of nvme_irq that handles CQEs from interrupt context, and thus is in the way of removing the cq_lock for normal queues, and avoiding lockdep warnings on the poll queues, for which we already take it without IRQ disabling. Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* nvme-pci: refactor nvme_disable_io_queuesChristoph Hellwig2018-12-041-21/+20
| | | | | | | | | Pass the opcode for the delete SQ/CQ command as an argument instead of the somewhat confusing pass loop. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* nvme-pci: consolidate code for polling non-dedicated queuesChristoph Hellwig2018-12-041-23/+12
| | | | | | | | | | | We have three places that can poll for I/O completions on a normal interrupt-enabled queue. All of them are in slow path code, so consolidate them to a single helper that uses spin_lock_irqsave and removes the fast path cqe_pending check. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* nvme-pci: only allow polling with separate poll queuesChristoph Hellwig2018-12-041-13/+5
| | | | | | | | | | | This will allow us to simplify both the regular NVMe interrupt handler and the upcoming aio poll code. In addition to that the separate queues are generally a good idea for performance reasons. Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* nvme-pci: cleanup SQ allocation a bitChristoph Hellwig2018-12-041-18/+15
| | | | | | | | | | Use a bit flag to mark if the SQ was allocated from the CMB, and clean up the surrounding code a bit. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* nvme-pci: use atomic bitops to mark a queue enabledChristoph Hellwig2018-12-041-28/+15
| | | | | | | | | This gets rid of all the messing with cq_vector and the ->polled field by using an atomic bitop to mark the queue enabled or not. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* block: move queues types to the block layerChristoph Hellwig2018-12-041-43/+25
| | | | | | | | | | | | | | Having another indirect all in the fast path doesn't really help in our post-spectre world. Also having too many queue type is just going to create confusion, so I'd rather manage them centrally. Note that the queue type naming and ordering changes a bit - the first index now is the default queue for everything not explicitly marked, the optional ones are read and poll queues. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* Merge tag 'v4.20-rc5' into for-4.21/blockJens Axboe2018-12-04267-1552/+2376
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull in v4.20-rc5, solving a conflict we'll otherwise get in aio.c and also getting the merge fix that went into mainline that users are hitting testing for-4.21/block and/or for-next. * tag 'v4.20-rc5': (664 commits) Linux 4.20-rc5 PCI: Fix incorrect value returned from pcie_get_speed_cap() MAINTAINERS: Update linux-mips mailing list address ocfs2: fix potential use after free mm/khugepaged: fix the xas_create_range() error path mm/khugepaged: collapse_shmem() do not crash on Compound mm/khugepaged: collapse_shmem() without freezing new_page mm/khugepaged: minor reorderings in collapse_shmem() mm/khugepaged: collapse_shmem() remember to clear holes mm/khugepaged: fix crashes due to misaccounted holes mm/khugepaged: collapse_shmem() stop if punched or truncated mm/huge_memory: fix lockdep complaint on 32-bit i_size_read() mm/huge_memory: splitting set mapping+index before unfreeze mm/huge_memory: rename freeze_page() to unmap_page() initramfs: clean old path before creating a hardlink kernel/kcov.c: mark funcs in __sanitizer_cov_trace_pc() as notrace psi: make disabling/enabling easier for vendor kernels proc: fixup map_files test on arm debugobjects: avoid recursive calls with kmemleak userfaultfd: shmem: UFFDIO_COPY: set the page dirty if VM_WRITE is not set ...
| * Merge tag 'armsoc-fixes' of ↵Linus Torvalds2018-12-021-1/+1
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "Volume is a little higher than usual due to a set of gpio fixes for Davinci platforms that's been around a while, still seemed appropriate to not hold off until next merge window. Besides that it's the usual mix of minor fixes, mostly corrections of small stuff in device trees. Major stability-related one is the removal of a regulator from DT on Rock960, since DVFS caused undervoltage. I expect it'll be restored once they figure out the underlying issue" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (28 commits) MAINTAINERS: Remove unused Qualcomm SoC mailing list ARM: davinci: dm644x: set the GPIO base to 0 ARM: davinci: da830: set the GPIO base to 0 ARM: davinci: dm355: set the GPIO base to 0 ARM: davinci: dm646x: set the GPIO base to 0 ARM: davinci: dm365: set the GPIO base to 0 ARM: davinci: da850: set the GPIO base to 0 gpio: davinci: restore a way to manually specify the GPIO base ARM: davinci: dm644x: define gpio interrupts as separate resources ARM: davinci: dm355: define gpio interrupts as separate resources ARM: davinci: dm646x: define gpio interrupts as separate resources ARM: davinci: dm365: define gpio interrupts as separate resources ARM: davinci: da8xx: define gpio interrupts as separate resources ARM: dts: at91: sama5d2: use the divided clock for SMC ARM: dts: imx51-zii-rdu1: Remove EEPROM node ARM: dts: rockchip: Remove @0 from the veyron memory node arm64: dts: rockchip: Fix PCIe reset polarity for rk3399-puma-haikou. arm64: dts: qcom: msm8998: Reserve gpio ranges on MTP arm64: dts: sdm845-mtp: Reserve reserved gpios arm64: dts: ti: k3-am654: Fix wakeup_uart reg address ...
| | * gpio: davinci: restore a way to manually specify the GPIO baseBartosz Golaszewski2018-11-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 587f7a694f01 ("gpio: davinci: Use dev name for label and automatic base selection") broke the network support in legacy boot mode for da850-evm since we can no longer request the MDIO clock GPIO. Other boards may be broken too, which I haven't tested. The problem is in the fact that most board files still use the legacy GPIO API where lines are requested by numbers rather than descriptors. While this should be fixed eventually, in order to unbreak the board for now - provide a way to manually specify the GPIO base in platform data. Fixes: 587f7a694f01 ("gpio: davinci: Use dev name for label and automatic base selection") Cc: stable@vger.kernel.org Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sekhar Nori <nsekhar@ti.com>
| * | Merge tag 'for-linus-4.20a-rc5-tag' of ↵Linus Torvalds2018-12-023-58/+12
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - A revert of a previous commit as it is no longer necessary and has shown to cause problems in some memory hotplug cases. - Some small fixes and a minor cleanup. - A patch for adding better diagnostic data in a very rare failure case. * tag 'for-linus-4.20a-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: pvcalls-front: fixes incorrect error handling Revert "xen/balloon: Mark unallocated host memory as UNUSABLE" xen: xlate_mmu: add missing header to fix 'W=1' warning xen/x86: add diagnostic printout to xen_mc_flush() in case of error x86/xen: cleanup includes in arch/x86/xen/spinlock.c
| | * | pvcalls-front: fixes incorrect error handlingPan Bian2018-11-291-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kfree() is incorrectly used to release the pages allocated by __get_free_page() and __get_free_pages(). Use the matching deallocators i.e., free_page() and free_pages(), respectively. Signed-off-by: Pan Bian <bianpan2016@163.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Juergen Gross <jgross@suse.com>
| | * | Revert "xen/balloon: Mark unallocated host memory as UNUSABLE"Igor Druzhinin2018-11-291-56/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit b3cf8528bb21febb650a7ecbf080d0647be40b9f. That commit unintentionally broke Xen balloon memory hotplug with "hotplug_unpopulated" set to 1. As long as "System RAM" resource got assigned under a new "Unusable memory" resource in IO/Mem tree any attempt to online this memory would fail due to general kernel restrictions on having "System RAM" resources as 1st level only. The original issue that commit has tried to workaround fa564ad96366 ("x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 00-1f, 30-3f, 60-7f)") also got amended by the following 03a551734 ("x86/PCI: Move and shrink AMD 64-bit window to avoid conflict") which made the original fix to Xen ballooning unnecessary. Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com>
| | * | xen: xlate_mmu: add missing header to fix 'W=1' warningSrikanth Boddepalli2018-11-291-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a missing header otherwise compiler warns about missed prototype: drivers/xen/xlate_mmu.c:183:5: warning: no previous prototype for 'xen_xlate_unmap_gfn_range?' [-Wmissing-prototypes] int xen_xlate_unmap_gfn_range(struct vm_area_struct *vma, ^~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Srikanth Boddepalli <boddepalli.srikanth@gmail.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Joey Pabalinas <joeypabalinas@gmail.com> Signed-off-by: Juergen Gross <jgross@suse.com>
| * | | Merge tag 'dmaengine-fix-4.20-rc5' of ↵Linus Torvalds2018-12-021-1/+9
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.infradead.org/users/vkoul/slave-dma Pull dmaengine fixes from Vinod Koul: "This contains two fixes to at_hdmac which fixes long standing bus reported recently on serial transfers causing memory leak. These fixes were done by Richard Genoud" * tag 'dmaengine-fix-4.20-rc5' of git://git.infradead.org/users/vkoul/slave-dma: dmaengine: at_hdmac: fix module unloading dmaengine: at_hdmac: fix memory leak in at_dma_xlate()
| | * | | dmaengine: at_hdmac: fix module unloadingRichard Genoud2018-11-291-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | of_dma_controller_free() was not called on module onloading. This lead to a soft lockup: watchdog: BUG: soft lockup - CPU#0 stuck for 23s! Modules linked in: at_hdmac [last unloaded: at_hdmac] when of_dma_request_slave_channel() tried to call ofdma->of_dma_xlate(). Cc: stable@vger.kernel.org Fixes: bbe89c8e3d59 ("at_hdmac: move to generic DMA binding") Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com> Signed-off-by: Richard Genoud <richard.genoud@gmail.com> Signed-off-by: Vinod Koul <vkoul@kernel.org>
| | * | | dmaengine: at_hdmac: fix memory leak in at_dma_xlate()Richard Genoud2018-11-291-1/+7
| | | |/ | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The leak was found when opening/closing a serial port a great number of time, increasing kmalloc-32 in slabinfo. Each time the port was opened, dma_request_slave_channel() was called. Then, in at_dma_xlate(), atslave was allocated with devm_kzalloc() and never freed. (Well, it was free at module unload, but that's not what we want). So, here, kzalloc is more suited for the job since it has to be freed in atc_free_chan_resources(). Cc: stable@vger.kernel.org Fixes: bbe89c8e3d59 ("at_hdmac: move to generic DMA binding") Reported-by: Mario Forner <m.forner@be4energy.com> Suggested-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com> Signed-off-by: Richard Genoud <richard.genoud@gmail.com> Signed-off-by: Vinod Koul <vkoul@kernel.org>
| * | | Merge tag 'for-linus-20181201' of git://git.kernel.dk/linux-blockLinus Torvalds2018-12-014-4/+11
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block layer fixes from Jens Axboe: - Single range elevator discard merge fix, that caused crashes (Ming) - Fix for a regression in O_DIRECT, where we could potentially lose the error value (Maximilian Heyne) - NVMe pull request from Christoph, with little fixes all over the map for NVMe. * tag 'for-linus-20181201' of git://git.kernel.dk/linux-block: block: fix single range discard merge nvme-rdma: fix double freeing of async event data nvme: flush namespace scanning work just before removing namespaces nvme: warn when finding multi-port subsystems without multipathing enabled fs: fix lost error code in dio_complete nvme-pci: fix surprise removal nvme-fc: initialize nvme_req(rq)->ctrl after calling __nvme_fc_init_request() nvme: Free ctrl device name on init failure
| | * | | nvme-rdma: fix double freeing of async event dataPrabhath Sajeepa2018-11-301-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some error paths in configuration of admin queue free data buffer associated with async request SQE without resetting the data buffer pointer to NULL, This buffer is also freed up again if the controller is shutdown or reset. Signed-off-by: Prabhath Sajeepa <psajeepa@purestorage.com> Reviewed-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| | * | | nvme: flush namespace scanning work just before removing namespacesSagi Grimberg2018-11-301-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | nvme_stop_ctrl can be called also for reset flow and there is no need to flush the scan_work as namespaces are not being removed. This can cause deadlock in rdma, fc and loop drivers since nvme_stop_ctrl barriers before controller teardown (and specifically I/O cancellation of the scan_work itself) takes place, but the scan_work will be blocked anyways so there is no need to flush it. Instead, move scan_work flush to nvme_remove_namespaces() where it really needs to flush. Reported-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed by: James Smart <jsmart2021@gmail.com> Tested-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| | * | | nvme: warn when finding multi-port subsystems without multipathing enabledChristoph Hellwig2018-11-301-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Without CONFIG_NVME_MULTIPATH enabled a multi-port subsystem might show up as invididual devices and cause problems, warn about it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
| | * | | nvme-pci: fix surprise removalIgor Konopko2018-11-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a PCIe NVMe device is not present, nvme_dev_remove_admin() calls blk_cleanup_queue() on the admin queue, which frees the hctx for that queue. Moments later, on the same path nvme_kill_queues() calls blk_mq_unquiesce_queue() on admin queue and tries to access hctx of it, which leads to following OOPS: Oops: 0000 [#1] SMP PTI RIP: 0010:sbitmap_any_bit_set+0xb/0x40 Call Trace: blk_mq_run_hw_queue+0xd5/0x150 blk_mq_run_hw_queues+0x3a/0x50 nvme_kill_queues+0x26/0x50 nvme_remove_namespaces+0xb2/0xc0 nvme_remove+0x60/0x140 pci_device_remove+0x3b/0xb0 Fixes: cb4bfda62afa2 ("nvme-pci: fix hot removal during error handling") Signed-off-by: Igor Konopko <igor.j.konopko@intel.com> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| | * | | nvme-fc: initialize nvme_req(rq)->ctrl after calling __nvme_fc_init_request()Ewan D. Milne2018-11-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __nvme_fc_init_request() invokes memset() on the nvme_fcp_op_w_sgl structure, which NULLed-out the nvme_req(req)->ctrl field previously set by nvme_fc_init_request(). This apparently was not referenced until commit faf4a44fff ("nvme: support traffic based keep-alive") which now results in a crash in nvme_complete_rq(): [ 8386.897130] RIP: 0010:panic+0x220/0x26c [ 8386.901406] Code: 83 3d 6f ee 72 01 00 74 05 e8 e8 54 02 00 48 c7 c6 40 fd 5b b4 48 c7 c7 d8 8d c6 b3 31e [ 8386.922359] RSP: 0018:ffff99650019fc40 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 [ 8386.930804] RAX: 0000000000000046 RBX: 0000000000000000 RCX: 0000000000000006 [ 8386.938764] RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffff8e325f8168b0 [ 8386.946725] RBP: ffff99650019fcb0 R08: 0000000000000000 R09: 00000000000004f8 [ 8386.954687] R10: 0000000000000000 R11: ffff99650019f9b8 R12: ffffffffb3c55f3c [ 8386.962648] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000001 [ 8386.970613] oops_end+0xd1/0xe0 [ 8386.974116] no_context+0x1b2/0x3c0 [ 8386.978006] do_page_fault+0x32/0x140 [ 8386.982090] page_fault+0x1e/0x30 [ 8386.985786] RIP: 0010:nvme_complete_rq+0x65/0x1d0 [nvme_core] [ 8386.992195] Code: 41 bc 03 00 00 00 74 16 0f 86 c3 00 00 00 66 3d 83 00 41 bc 06 00 00 00 0f 85 e7 00 000 [ 8387.013147] RSP: 0018:ffff99650019fe18 EFLAGS: 00010246 [ 8387.018973] RAX: 0000000000000000 RBX: ffff8e322ae51280 RCX: 0000000000000001 [ 8387.026935] RDX: 0000000000000400 RSI: 0000000000000001 RDI: ffff8e322ae51280 [ 8387.034897] RBP: ffff8e322ae51280 R08: 0000000000000000 R09: ffffffffb2f0b890 [ 8387.042859] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000 [ 8387.050821] R13: 0000000000000100 R14: 0000000000000004 R15: ffff8e2b0446d990 [ 8387.058782] ? swiotlb_unmap_page+0x40/0x40 [ 8387.063448] nvme_fc_complete_rq+0x2d/0x70 [nvme_fc] [ 8387.068986] blk_done_softirq+0xa1/0xd0 [ 8387.073264] __do_softirq+0xd6/0x2a9 [ 8387.077251] run_ksoftirqd+0x26/0x40 [ 8387.081238] smpboot_thread_fn+0x10e/0x160 [ 8387.085807] kthread+0xf8/0x130 [ 8387.089309] ? sort_range+0x20/0x20 [ 8387.093198] ? kthread_stop+0x110/0x110 [ 8387.097475] ret_from_fork+0x35/0x40 [ 8387.101462] ---[ end trace 7106b0adf5e422f8 ]--- Fixes: faf4a44fff ("nvme: support traffic based keep-alive") Signed-off-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| | * | | nvme: Free ctrl device name on init failureKeith Busch2018-11-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Free the kobject name that was allocated for the controller device on failure rather than its parent. Fixes: d22524a4782a9 ("nvme: switch controller refcounting to use struct device") Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | | | Merge tag 'pci-v4.20-fixes-2' of ↵Linus Torvalds2018-12-014-24/+13
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: - Fix a link speed checking interface that broke PCIe gen3 cards in gen1 slots (Mikulas Patocka) - Fix an imx6 link training error (Trent Piepho) - Fix a layerscape outbound window accessor calling error (Hou Zhiqiang) - Fix a DesignWare endpoint MSI-X address calculation error (Gustavo Pimentel) * tag 'pci-v4.20-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: Fix incorrect value returned from pcie_get_speed_cap() PCI: dwc: Fix MSI-X EP framework address calculation bug PCI: layerscape: Fix wrong invocation of outbound window disable accessor PCI: imx6: Fix link training status detection in link up check
| | * \ \ \ Merge remote-tracking branch 'lorenzo/pci/controller-fixes' into for-linusBjorn Helgaas2018-11-303-11/+2
| | |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Fix DesignWare endpoint MSI-X address calculation bug (Gustavo Pimentel) - Fix Layerscape outbound window disable usage (Hou Zhiqiang) - Fix imx6 link up detection (Trent Piepho) * lorenzo/pci/controller-fixes: PCI: dwc: Fix MSI-X EP framework address calculation bug PCI: layerscape: Fix wrong invocation of outbound window disable accessor PCI: imx6: Fix link training status detection in link up check
| | | * | | | PCI: dwc: Fix MSI-X EP framework address calculation bugGustavo Pimentel2018-11-271-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix an error caused by 3-bit right rotation on offset address calculation of MSI-X table in dw_pcie_ep_raise_msix_irq(). The initial testing code was setting by default the offset address of MSI-X table to zero, so that even with a 3-bit right rotation the computed result would still be zero and valid, therefore this bug went unnoticed. Fixes: beb4641a787d ("PCI: dwc: Add MSI-X callbacks handler") Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com> [lorenzo.pieralisi@arm.com: updated commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: stable@vger.kernel.org
| | | * | | | PCI: layerscape: Fix wrong invocation of outbound window disable accessorHou Zhiqiang2018-11-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The order of parameters is not correct when invoking the outbound window disable routine. Fix it. Fixes: 4a2745d760fa ("PCI: layerscape: Disable outbound windows configured by bootloader") Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> [lorenzo.pieralisi@arm.com: commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: stable@vger.kernel.org
| | | * | | | PCI: imx6: Fix link training status detection in link up checkTrent Piepho2018-11-201-9/+1
| | | | |/ / | | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This bug was introduced in the interaction for two commits on either branch of the merge commit 562df5c8521e ("Merge branch 'pci/host-designware' into next"). Commit 4d107d3b5a68 ("PCI: imx6: Move link up check into imx6_pcie_wait_for_link()"), changed imx6_pcie_wait_for_link() to poll the link status register directly, checking for link up and not training, and made imx6_pcie_link_up() only check the link up bit (once, not a polling loop). While commit 886bc5ceb5cc ("PCI: designware: Add generic dw_pcie_wait_for_link()"), replaced the loop in imx6_pcie_wait_for_link() with a call to a new dwc core function, which polled imx6_pcie_link_up(), which still checked both link up and not training in a loop. When these two commits were merged, the version of imx6_pcie_wait_for_link() from 886bc5ceb5cc was kept, which eliminated the link training check placed there by 4d107d3b5a68. However, the version of imx6_pcie_link_up() from 4d107d3b5a68 was kept, which eliminated the link training check that had been there and was moved to imx6_pcie_wait_for_link(). The result was the link training check got lost for the imx6 driver. Eliminate imx6_pcie_link_up() so that the default handler, dw_pcie_link_up(), is used instead. The default handler has the correct code, which checks for link up and also that it still is not training, fixing the regression. Fixes: 562df5c8521e ("Merge branch 'pci/host-designware' into next") Signed-off-by: Trent Piepho <tpiepho@impinj.com> [lorenzo.pieralisi@arm.com: rewrote the commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Joao Pinto <Joao.Pinto@synopsys.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Richard Zhu <hongxing.zhu@nxp.com>
| | * | | | PCI: Fix incorrect value returned from pcie_get_speed_cap()Mikulas Patocka2018-11-301-13/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The macros PCI_EXP_LNKCAP_SLS_*GB are values, not bit masks. We must mask the register and compare it against them. This fixes errors like this: amdgpu: [powerplay] failed to send message 261 ret is 0 when a PCIe-v3 card is plugged into a PCIe-v1 slot, because the slot is being incorrectly reported as PCIe-v3 capable. 6cf57be0f78e, which appeared in v4.17, added pcie_get_speed_cap() with the incorrect test of PCI_EXP_LNKCAP_SLS as a bitmask. 5d9a63304032, which appeared in v4.19, changed amdgpu to use pcie_get_speed_cap(), so the amdgpu bug reports below are regressions in v4.19. Fixes: 6cf57be0f78e ("PCI: Add pcie_get_speed_cap() to find max supported link speed") Fixes: 5d9a63304032 ("drm/amdgpu: use pcie functions for link width and speed") Link: https://bugs.freedesktop.org/show_bug.cgi?id=108704 Link: https://bugs.freedesktop.org/show_bug.cgi?id=108778 Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> [bhelgaas: update comment, remove use of PCI_EXP_LNKCAP_SLS_8_0GB and PCI_EXP_LNKCAP_SLS_16_0GB since those should be covered by PCI_EXP_LNKCAP2, remove test of PCI_EXP_LNKCAP for zero, since that register is required] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # v4.17+
| * | | | | Merge tag 'arm64-fixes' of ↵Linus Torvalds2018-11-301-1/+1
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - Cortex-A76 erratum workaround - ftrace fix to enable syscall events on arm64 - Fix uninitialised pointer in iort_get_platform_device_domain() * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: ACPI/IORT: Fix iort_get_platform_device_domain() uninitialized pointer value arm64: ftrace: Fix to enable syscall events on arm64 arm64: Add workaround for Cortex-A76 erratum 1286807
| | * | | | | ACPI/IORT: Fix iort_get_platform_device_domain() uninitialized pointer valueLorenzo Pieralisi2018-11-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Running the Clang static analyzer on IORT code detected the following error: Logic error: Branch condition evaluates to a garbage value in iort_get_platform_device_domain() If the named component associated with a given device has no IORT mappings, iort_get_platform_device_domain() exits its MSI mapping loop with msi_parent pointer containing garbage, which can lead to erroneous code path execution. Initialize the msi_parent pointer, fixing the bug. Fixes: d4f54a186667 ("ACPI: platform: setup MSI domain for ACPI based platform device") Reported-by: Patrick Bellasi <patrick.bellasi@arm.com> Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
| * | | | | | Merge tag 'char-misc-4.20-rc5' of ↵Linus Torvalds2018-11-308-25/+67
| |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc fixes from Greg KH: "Here are a few small char/misc driver fixes for 4.20-rc5 that resolve a number of reported issues. The "largest" here is the thunderbolt patch, which resolves an issue with NVM upgrade, the smallest being some fsi driver fixes. There's also a hyperv bugfix, and the usual binder bugfixes. All of these have been in linux-next with no reported issues" * tag 'char-misc-4.20-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: misc: mic/scif: fix copy-paste error in scif_create_remote_lookup thunderbolt: Prevent root port runtime suspend during NVM upgrade Drivers: hv: vmbus: check the creation_status in vmbus_establish_gpadl() binder: fix race that allows malicious free of live buffer fsi: fsi-scom.c: Remove duplicate header fsi: master-ast-cf: select GENERIC_ALLOCATOR
| | * | | | | | misc: mic/scif: fix copy-paste error in scif_create_remote_lookupYueHaibing2018-11-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | gcc '-Wunused-but-set-variable' warning: drivers/misc/mic/scif/scif_rma.c: In function 'scif_create_remote_lookup': drivers/misc/mic/scif/scif_rma.c:373:25: warning: variable 'vmalloc_num_pages' set but not used [-Wunused-but-set-variable] 'vmalloc_num_pages' should be used to determine if the address is within the vmalloc range. Fixes: ba612aa8b487 ("misc: mic: SCIF memory registration and unregistration") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| | * | | | | | thunderbolt: Prevent root port runtime suspend during NVM upgradeMika Westerberg2018-11-261-2/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During NVM upgrade process the host router is hot-removed for a short while. During this time it is possible that the root port is moved into D3cold which would be fine if the root port could trigger PME on itself. However, many systems actually do not implement it so what happens is that the root port goes into D3cold and never wakes up unless userspace does PCI config space access, such as running 'lscpi'. For this reason we explicitly prevent the root port from runtime suspending during NVM upgrade. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| | * | | | | | Drivers: hv: vmbus: check the creation_status in vmbus_establish_gpadl()Dexuan Cui2018-11-261-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a longstanding issue: if the vmbus upper-layer drivers try to consume too many GPADLs, the host may return with an error 0xC0000044 (STATUS_QUOTA_EXCEEDED), but currently we forget to check the creation_status, and hence we can pass an invalid GPADL handle into the OPEN_CHANNEL message, and get an error code 0xc0000225 in open_info->response.open_result.status, and finally we hang in vmbus_open() -> "goto error_free_info" -> vmbus_teardown_gpadl(). With this patch, we can exit gracefully on STATUS_QUOTA_EXCEEDED. Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: stable@vger.kernel.org Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>