summaryrefslogtreecommitdiffstats
path: root/drivers
Commit message (Collapse)AuthorAgeFilesLines
* usb: typec: ucsi: acpi: Workaround for cache mode issueHeikki Krogerus2018-06-251-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes an issue where the driver fails with an error: ioremap error for 0x3f799000-0x3f79a000, requested 0x2, got 0x0 On some platforms the UCSI ACPI mailbox SystemMemory Operation Region may be setup before the driver has been loaded. That will lead into the driver failing to map the mailbox region, as it has been already marked as write-back memory. acpi_os_ioremap() for x86 uses ioremap_cache() unconditionally. When the issue happens, the embedded controller has a pending query event for the UCSI notification right after boot-up which causes the operation region to be setup before UCSI driver has been loaded. The fix is to notify acpi core that the driver is about to access memory region which potentially overlaps with an operation region right before mapping it. acpi_release_memory() will check if the memory has already been setup (mapped) by acpi core, and deactivate it (unmap) if it has. The driver is then able to map the memory with ioremap_nocache() and set the memtype to uncached for the region. Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Fixes: 8243edf44152 ("usb: typec: ucsi: Add ACPI driver") Cc: stable@vger.kernel.org Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* acpi: Add helper for deactivating memory regionHeikki Krogerus2018-06-251-0/+72
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Sometimes memory resource may be overlapping with SystemMemory Operation Region by design, for example if the memory region is used as a mailbox for communication with a firmware in the system. One occasion of such mailboxes is USB Type-C Connector System Software Interface (UCSI). With regions like that, it is important that the driver is able to map the memory with the requirements it has. For example, the driver should be allowed to map the memory as non-cached memory. However, if the operation region has been accessed before the driver has mapped the memory, the memory has been marked as write-back by the time the driver is loaded. That means the driver will fail to map the memory if it expects non-cached memory. To work around the problem, introducing helper that the drivers can use to temporarily deactivate (unmap) SystemMemory Operation Regions that overlap with their IO memory. Fixes: 8243edf44152 ("usb: typec: ucsi: Add ACPI driver") Cc: stable@vger.kernel.org Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* usb: xhci: increase CRS timeout valueAjay Gupta2018-06-251-1/+6
| | | | | | | | | | | | | Some controllers take almost 55ms to complete controller restore state (CRS). There is no timeout limit mentioned in xhci specification so fixing the issue by increasing the timeout limit to 100ms [reformat code comment -Mathias] Signed-off-by: Ajay Gupta <ajaykuee@gmail.com> Signed-off-by: Nagaraj Annaiah <naga.annaiah@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* usb: xhci: tegra: fix runtime PM error handlingStefan Agner2018-06-251-2/+2
| | | | | | | | | | | | | The address-of operator will always evaluate to true. However, power should be explicitly disabled if no power domain is used. Remove the address-of operator. Fixes: 58c38116c6cc ("usb: xhci: tegra: Add support for managing powergates") Signed-off-by: Stefan Agner <stefan@agner.ch> Acked-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* usb: xhci: remove the code build warningDongjiu Geng2018-06-251-1/+1
| | | | | | | | | | | | | | | | Initialize the 'err' variate to remove the build warning, the warning is shown as below: drivers/usb/host/xhci-tegra.c: In function 'tegra_xusb_mbox_thread': drivers/usb/host/xhci-tegra.c:552:6: warning: 'err' may be used uninitialized in this function [-Wuninitialized] drivers/usb/host/xhci-tegra.c:482:6: note: 'err' was declared here Fixes: e84fce0f8837 ("usb: xhci: Add NVIDIA Tegra XUSB controller driver") Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com> Acked-by: Thierry Reding <treding@nvidia.com> Acked-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* xhci: Fix kernel oops in trace_xhci_free_virt_deviceZhengjun Xing2018-06-252-7/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 44a182b9d177 ("xhci: Fix use-after-free in xhci_free_virt_device") set dev->udev pointer to NULL in xhci_free_dev(), it will cause kernel panic in trace_xhci_free_virt_device. This patch reimplement the trace function trace_xhci_free_virt_device, remove dev->udev dereference and added more useful parameters to show in the trace function,it also makes sure dev->udev is not NULL before calling trace_xhci_free_virt_device. This issue happened when xhci-hcd trace is enabled and USB devices hot plug test. Original use-after-free patch went to stable so this needs so be applied there as well. [ 1092.022457] usb 2-4: USB disconnect, device number 6 [ 1092.092772] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 [ 1092.101694] PGD 0 P4D 0 [ 1092.104601] Oops: 0000 [#1] SMP [ 1092.207734] Workqueue: usb_hub_wq hub_event [ 1092.212507] RIP: 0010:trace_event_raw_event_xhci_log_virt_dev+0x6c/0xf0 [ 1092.220050] RSP: 0018:ffff8c252e883d28 EFLAGS: 00010086 [ 1092.226024] RAX: ffff8c24af86fa84 RBX: 0000000000000003 RCX: ffff8c25255c2a01 [ 1092.234130] RDX: 0000000000000000 RSI: 00000000aef55009 RDI: ffff8c252e883d28 [ 1092.242242] RBP: ffff8c252550e2c0 R08: ffff8c24af86fa84 R09: 0000000000000a70 [ 1092.250364] R10: 0000000000000a70 R11: 0000000000000000 R12: ffff8c251f21a000 [ 1092.258468] R13: 000000000000000c R14: ffff8c251f21a000 R15: ffff8c251f432f60 [ 1092.266572] FS: 0000000000000000(0000) GS:ffff8c252e880000(0000) knlGS:0000000000000000 [ 1092.275757] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1092.282281] CR2: 0000000000000000 CR3: 0000000154209001 CR4: 00000000003606e0 [ 1092.290384] Call Trace: [ 1092.293156] <IRQ> [ 1092.295439] xhci_free_virt_device.part.34+0x182/0x1a0 [ 1092.301288] handle_cmd_completion+0x7ac/0xfa0 [ 1092.306336] ? trace_event_raw_event_xhci_log_trb+0x6e/0xa0 [ 1092.312661] xhci_irq+0x3e8/0x1f60 [ 1092.316524] __handle_irq_event_percpu+0x75/0x180 [ 1092.321876] handle_irq_event_percpu+0x20/0x50 [ 1092.326922] handle_irq_event+0x36/0x60 [ 1092.331273] handle_edge_irq+0x6d/0x180 [ 1092.335644] handle_irq+0x16/0x20 [ 1092.339417] do_IRQ+0x41/0xc0 [ 1092.342782] common_interrupt+0xf/0xf [ 1092.346955] </IRQ> Fixes: 44a182b9d177 ("xhci: Fix use-after-free in xhci_free_virt_device") Cc: <stable@vger.kernel.org> Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* xhci: Fix perceived dead host due to runtime suspend race with event handlerMathias Nyman2018-06-252-3/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Don't rely on event interrupt (EINT) bit alone to detect pending port change in resume. If no change event is detected the host may be suspended again, oterwise roothubs are resumed. There is a lag in xHC setting EINT. If we don't notice the pending change in resume, and the controller is runtime suspeded again, it causes the event handler to assume host is dead as it will fail to read xHC registers once PCI puts the controller to D3 state. [ 268.520969] xhci_hcd: xhci_resume: starting port polling. [ 268.520985] xhci_hcd: xhci_hub_status_data: stopping port polling. [ 268.521030] xhci_hcd: xhci_suspend: stopping port polling. [ 268.521040] xhci_hcd: // Setting command ring address to 0x349bd001 [ 268.521139] xhci_hcd: Port Status Change Event for port 3 [ 268.521149] xhci_hcd: resume root hub [ 268.521163] xhci_hcd: port resume event for port 3 [ 268.521168] xhci_hcd: xHC is not running. [ 268.521174] xhci_hcd: handle_port_status: starting port polling. [ 268.596322] xhci_hcd: xhci_hc_died: xHCI host controller not responding, assume dead The EINT lag is described in a additional note in xhci specs 4.19.2: "Due to internal xHC scheduling and system delays, there will be a lag between a change bit being set and the Port Status Change Event that it generated being written to the Event Ring. If SW reads the PORTSC and sees a change bit set, there is no guarantee that the corresponding Port Status Change Event has already been written into the Event Ring." Cc: <stable@vger.kernel.org> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* Merge tag 'fixes-for-v4.18-rc1' of ↵Greg Kroah-Hartman2018-06-2512-44/+166
|\ | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-linus usb: fixes for v4.18-rc1 First set of fixes for the current -rc cycle. The main parts being warnings of different kinds being fixed. We're also adding support for Intel'l Icelake devices on dwc3-pci.c.
| * dwc2: gadget: Fix ISOC IN DDMA PID bitfield value calculationMinas Harutyunyan2018-06-191-1/+6
| | | | | | | | | | | | | | | | | | PID bitfield in descriptor should be set based on particular request length, not based on EP's mc value. PID value can't be set to 0 even request length is 0. Signed-off-by: Minas Harutyunyan <hminas@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
| * usb: gadget: dwc2: fix memory leak in gadget_init()Grigor Tovmasyan2018-06-191-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | Freed allocated request for ep0 to prevent memory leak in case when dwc2_driver_probe() failed. Cc: Stefan Wahren <stefan.wahren@i2se.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Stefan Wahren <stefan.wahren@i2se.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Acked-by: Minas Harutyunyan <hminas@synopsys.com> Signed-off-by: Grigor Tovmasyan <tovmasya@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
| * usb: gadget: composite: fix delayed_status race condition when set_interfaceChunfeng Yun2018-06-191-0/+3
| | | | | | | | | | | | | | | | | | | | | | It happens when enable debug log, if set_alt() returns USB_GADGET_DELAYED_STATUS and usb_composite_setup_continue() is called before increasing count of @delayed_status, so fix it by using spinlock of @cdev->lock. Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com> Tested-by: Jay Hsu <shih-chieh.hsu@mediatek.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
| * usb: dwc2: fix isoc split in transfer with no dataWilliam Wu2018-06-191-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If isoc split in transfer with no data (the length of DATA0 packet is zero), we can't simply return immediately. Because the DATA0 can be the first transaction or the second transaction for the isoc split in transaction. If the DATA0 packet with no data is in the first transaction, we can return immediately. But if the DATA0 packet with no data is in the second transaction of isoc split in transaction sequence, we need to increase the qtd->isoc_frame_index and giveback urb to device driver if needed, otherwise, the MDATA packet will be lost. A typical test case is that connect the dwc2 controller with an usb hs Hub (GL852G-12), and plug an usb fs audio device (Plantronics headset) into the downstream port of Hub. Then use the usb mic to record, we can find noise when playback. In the case, the isoc split in transaction sequence like this: - SSPLIT IN transaction - CSPLIT IN transaction - MDATA packet (176 bytes) - CSPLIT IN transaction - DATA0 packet (0 byte) This patch use both the length of DATA0 and qtd->isoc_split_offset to check if the DATA0 is in the second transaction. Tested-by: Gevorg Sahakyan <sahakyan@synopsys.com> Tested-by: Heiko Stuebner <heiko@sntech.de> Acked-by: Minas Harutyunyan hminas@synopsys.com> Signed-off-by: William Wu <william.wu@rock-chips.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
| * usb: dwc2: alloc dma aligned buffer for isoc split inWilliam Wu2018-06-195-5/+106
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The commit 3bc04e28a030 ("usb: dwc2: host: Get aligned DMA in a more supported way") rips out a lot of code to simply the allocation of aligned DMA. However, it also introduces a new issue when use isoc split in transfer. In my test case, I connect the dwc2 controller with an usb hs Hub (GL852G-12), and plug an usb fs audio device (Plantronics headset) into the downstream port of Hub. Then use the usb mic to record, we can find noise when playback. It's because that the usb Hub uses an MDATA for the first transaction and a DATA0 for the second transaction for the isoc split in transaction. An typical isoc split in transaction sequence like this: - SSPLIT IN transaction - CSPLIT IN transaction - MDATA packet - CSPLIT IN transaction - DATA0 packet The DMA address of MDATA (urb->dma) is always DWORD-aligned, but the DMA address of DATA0 (urb->dma + qtd->isoc_split_offset) may not be DWORD-aligned, it depends on the qtd->isoc_split_offset (the length of MDATA). In my test case, the length of MDATA is usually unaligned, this cause DATA0 packet transmission error. This patch use kmem_cache to allocate aligned DMA buf for isoc split in transaction. Note that according to usb 2.0 spec, the maximum data payload size is 1023 bytes for each fs isoc ep, and the maximum allowable interrupt data payload size is 64 bytes or less for fs interrupt ep. So we set the size of object to be 1024 bytes in the kmem cache. Tested-by: Gevorg Sahakyan <sahakyan@synopsys.com> Tested-by: Heiko Stuebner <heiko@sntech.de> Acked-by: Minas Harutyunyan hminas@synopsys.com> Signed-off-by: William Wu <william.wu@rock-chips.com> Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
| * usb: dwc2: fix the incorrect bitmaps for the ports of multi_tt hubWilliam Wu2018-06-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The dwc2_get_ls_map() use ttport to reference into the bitmap if we're on a multi_tt hub. But the bitmaps index from 0 to (hub->maxchild - 1), while the ttport index from 1 to hub->maxchild. This will cause invalid memory access when the number of ttport is hub->maxchild. Without this patch, I can easily meet a Kernel panic issue if connect a low-speed USB mouse with the max port of FE2.1 multi-tt hub (1a40:0201) on rk3288 platform. Fixes: 9f9f09b048f5 ("usb: dwc2: host: Totally redo the microframe scheduler") Cc: <stable@vger.kernel.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Acked-by: Minas Harutyunyan hminas@synopsys.com> Signed-off-by: William Wu <william.wu@rock-chips.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
| * usb: dwc2: Fix host exit from hibernation flow.Artur Petrosyan2018-06-191-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In case when a hub is connected to DWC2 host auto suspend occurs and host goes to hibernation. When any device connected to hub host hibernation exiting incorrectly. - Added dwc2_hcd_rem_wakeup() function call to exit from suspend state by remote wakeup. - Increase timeout value for port suspend bit to be set. Acked-by: Minas Harutyunyan <hminas@synopsys.com> Signed-off-by: Artur Petrosyan <arturp@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
| * usb: dwc3: qcom: mark PM functions as __maybe_unusedArnd Bergmann2018-06-191-8/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The #ifdef guards around these are wrong, resulting in warnings in certain configurations: drivers/usb/dwc3/dwc3-qcom.c:244:12: error: 'dwc3_qcom_resume' defined but not used [-Werror=unused-function] static int dwc3_qcom_resume(struct dwc3_qcom *qcom) ^~~~~~~~~~~~~~~~ drivers/usb/dwc3/dwc3-qcom.c:223:12: error: 'dwc3_qcom_suspend' defined but not used [-Werror=unused-function] static int dwc3_qcom_suspend(struct dwc3_qcom *qcom) This replaces the guards with __maybe_unused annotations to shut up the warnings and give better compile time coverage. Fixes: a4333c3a6ba9 ("usb: dwc3: Add Qualcomm DWC3 glue driver") Reviewed-by: Douglas Anderson <dianders@chromium.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
| * usb: dwc3: Fix error return code in dwc3_qcom_probe()Wei Yongjun2018-06-191-0/+1
| | | | | | | | | | | | | | | | | | Fix to return error code -ENODEV from the get device failed error handling case instead of 0, as done elsewhere in this function. Fixes: a4333c3a6ba9 ("usb: dwc3: Add Qualcomm DWC3 glue driver") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
| * usb: dwc2: gadget: fix packet drop issue for ISOC OUT transfersMinas Harutyunyan2018-06-181-2/+0
| | | | | | | | | | | | | | | | | | In ISOC OUT transfer, when the OUT token received while EP disabled, we shouldn't complete a usb request. The current flow completed one usb request, this will lead to a packet drop to function driver. Signed-off-by: Minas Harutyunyan <hminas@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
| * usb: dwc3: Only call clk_bulk_get() on devicetree instantiated devicesHans de Goede2018-06-181-10/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit fe8abf332b8f ("usb: dwc3: support clocks and resets for DWC3 core") adds support for handling clocks and resets in the DWC3 core, so that for platforms following the standard devicetree bindings this does not need to be duplicated in all the different glue layers. These changes intended for devicetree based platforms introduce an uncoditional clk_bulk_get() in the core probe path. This leads to the following error being logged on x86/ACPI systems: [ 26.276783] dwc3 dwc3.3.auto: Failed to get clk 'ref': -2 This commits wraps the clk_bulk_get() in an if (dev->of_node) check so that it only is done on devicetree instantiated devices, fixing this error. Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
| * usb: dwc2: gadget: fix packet drop issue in dwc2_gadget_handle_nakZeng Tao2018-06-181-3/+0
| | | | | | | | | | | | | | | | | | | | In ISOC transfer, when the NAK interrupt happens, we shouldn't complete a usb request, the current flow will complete one usb request with no hardware transfer, this will lead to a packet drop on the usb bus. Acked-by: Minas Harutyunyan <hminas@synopsys.com> Signed-off-by: Zeng Tao <prime.zeng@hisilicon.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
| * usb: dwc3: of-simple: fix use-after-free on removeJohan Hovold2018-06-181-1/+2
| | | | | | | | | | | | | | | | | | | | The clocks have already been explicitly disabled and put as part of remove() so the runtime suspend callback must not be run when balancing the runtime PM usage count before returning. Fixes: 16adc674d0d6 ("usb: dwc3: add generic OF glue layer") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
| * usb: dwc2: gadget: Fix issue in dwc2_gadget_start_isoc()Minas Harutyunyan2018-06-181-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | In case of requests queue is empty reset EP target_frame to initial value. This allow restarting ISOC traffic in case when function driver queued requests with interruptions. Tested-by: Zeng Tao <prime.zeng@hisilicon.com> Signed-off-by: Minas Harutyunyan <hminas@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
| * usb: gadget: ffs: Fix BUG when userland exits with submitted AIO transfersVincent Pelletier2018-06-181-8/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This bug happens only when the UDC needs to sleep during usb_ep_dequeue, as is the case for (at least) dwc3. [ 382.200896] BUG: scheduling while atomic: screen/1808/0x00000100 [ 382.207124] 4 locks held by screen/1808: [ 382.211266] #0: (rcu_callback){....}, at: [<c10b4ff0>] rcu_process_callbacks+0x260/0x440 [ 382.219949] #1: (rcu_read_lock_sched){....}, at: [<c1358ba0>] percpu_ref_switch_to_atomic_rcu+0xb0/0x130 [ 382.230034] #2: (&(&ctx->ctx_lock)->rlock){....}, at: [<c11f0c73>] free_ioctx_users+0x23/0xd0 [ 382.230096] #3: (&(&ffs->eps_lock)->rlock){....}, at: [<f81e7710>] ffs_aio_cancel+0x20/0x60 [usb_f_fs] [ 382.230160] Modules linked in: usb_f_fs libcomposite configfs bnep btsdio bluetooth ecdh_generic brcmfmac brcmutil intel_powerclamp coretemp dwc3 kvm_intel ulpi udc_core kvm irqbypass crc32_pclmul crc32c_intel pcbc dwc3_pci aesni_intel aes_i586 crypto_simd cryptd ehci_pci ehci_hcd gpio_keys usbcore basincove_gpadc industrialio usb_common [ 382.230407] CPU: 1 PID: 1808 Comm: screen Not tainted 4.14.0-edison+ #117 [ 382.230416] Hardware name: Intel Corporation Merrifield/BODEGA BAY, BIOS 542 2015.01.21:18.19.48 [ 382.230425] Call Trace: [ 382.230438] <SOFTIRQ> [ 382.230466] dump_stack+0x47/0x62 [ 382.230498] __schedule_bug+0x61/0x80 [ 382.230522] __schedule+0x43/0x7a0 [ 382.230587] schedule+0x5f/0x70 [ 382.230625] dwc3_gadget_ep_dequeue+0x14c/0x270 [dwc3] [ 382.230669] ? do_wait_intr_irq+0x70/0x70 [ 382.230724] usb_ep_dequeue+0x19/0x90 [udc_core] [ 382.230770] ffs_aio_cancel+0x37/0x60 [usb_f_fs] [ 382.230798] kiocb_cancel+0x31/0x40 [ 382.230822] free_ioctx_users+0x4d/0xd0 [ 382.230858] percpu_ref_switch_to_atomic_rcu+0x10a/0x130 [ 382.230881] ? percpu_ref_exit+0x40/0x40 [ 382.230904] rcu_process_callbacks+0x2b3/0x440 [ 382.230965] __do_softirq+0xf8/0x26b [ 382.231011] ? __softirqentry_text_start+0x8/0x8 [ 382.231033] do_softirq_own_stack+0x22/0x30 [ 382.231042] </SOFTIRQ> [ 382.231071] irq_exit+0x45/0xc0 [ 382.231089] smp_apic_timer_interrupt+0x13c/0x150 [ 382.231118] apic_timer_interrupt+0x35/0x3c [ 382.231132] EIP: __copy_user_ll+0xe2/0xf0 [ 382.231142] EFLAGS: 00210293 CPU: 1 [ 382.231154] EAX: bfd4508c EBX: 00000004 ECX: 00000003 EDX: f3d8fe50 [ 382.231165] ESI: f3d8fe51 EDI: bfd4508d EBP: f3d8fe14 ESP: f3d8fe08 [ 382.231176] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 [ 382.231265] core_sys_select+0x25f/0x320 [ 382.231346] ? __wake_up_common_lock+0x62/0x80 [ 382.231399] ? tty_ldisc_deref+0x13/0x20 [ 382.231438] ? ldsem_up_read+0x1b/0x40 [ 382.231459] ? tty_ldisc_deref+0x13/0x20 [ 382.231479] ? tty_write+0x29f/0x2e0 [ 382.231514] ? n_tty_ioctl+0xe0/0xe0 [ 382.231541] ? tty_write_unlock+0x30/0x30 [ 382.231566] ? __vfs_write+0x22/0x110 [ 382.231604] ? security_file_permission+0x2f/0xd0 [ 382.231635] ? rw_verify_area+0xac/0x120 [ 382.231677] ? vfs_write+0x103/0x180 [ 382.231711] SyS_select+0x87/0xc0 [ 382.231739] ? SyS_write+0x42/0x90 [ 382.231781] do_fast_syscall_32+0xd6/0x1a0 [ 382.231836] entry_SYSENTER_32+0x47/0x71 [ 382.231848] EIP: 0xb7f75b05 [ 382.231857] EFLAGS: 00000246 CPU: 1 [ 382.231868] EAX: ffffffda EBX: 00000400 ECX: bfd4508c EDX: bfd4510c [ 382.231878] ESI: 00000000 EDI: 00000000 EBP: 00000000 ESP: bfd45020 [ 382.231889] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b [ 382.232281] softirq: huh, entered softirq 9 RCU c10b4d90 with preempt_count 00000100, exited with 00000000? Tested-by: Sam Protsenko <semen.protsenko@linaro.org> Signed-off-by: Vincent Pelletier <plr.vincent@gmail.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
| * usb: dwc3: pci: add support for Intel IceLakeHeikki Krogerus2018-06-181-0/+2
| | | | | | | | | | | | | | PCI IDs for Intel IceLake. Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
* | Merge branch 'efi-urgent-for-linus' of ↵Linus Torvalds2018-06-241-1/+1
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI fixes from Thomas Gleixner: "Two fixlets for the EFI maze: - Properly zero variables to prevent an early boot hang on EFI mixed mode systems - Fix the fallout of merging the 32bit and 64bit variants of EFI PCI related code which ended up chosing the 32bit variant of the actual EFi call invocation which leads to failures on 64bit" * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi/x86: Fix incorrect invocation of PciIo->Attributes() efi/libstub/tpm: Initialize efi_physical_addr_t vars to zero for mixed mode
| * | efi/libstub/tpm: Initialize efi_physical_addr_t vars to zero for mixed modeHans de Goede2018-06-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit: 79832f0b5f71 ("efi/libstub/tpm: Initialize pointer variables to zero for mixed mode") fixes a problem with the tpm code on mixed mode (64-bit kernel on 32-bit UEFI), where 64-bit pointer variables are not fully initialized by the 32-bit EFI code. A similar problem applies to the efi_physical_addr_t variables which are written by the ->get_event_log() EFI call. Even though efi_physical_addr_t is 64-bit everywhere, it seems that some 32-bit UEFI implementations only fill in the lower 32 bits when passed a pointer to an efi_physical_addr_t to fill. This commit initializes these to 0 to, to ensure the upper 32 bits are 0 in mixed mode. This fixes recent kernels sometimes hanging during early boot on mixed mode UEFI systems. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: <stable@vger.kernel.org> # v4.16+ Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20180622064222.11633-2-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds2018-06-241-1/+3
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "A small set of fixes for time(r) related issues: - Fix a long standing conversion issue in jiffies_to_msecs() for odd HZ values like 1024 or 1200 which resulted in returning 0 for small jiffies values due to rounding down. - Use the proper CONFIG symbol in the new Y2038 safe compat code for posix-timers. Not yet a visible breakage, but this will immediately trigger when the architecture support for the new interfaces is merged. - Return an error code in the STM32 clocksource driver on failure instead of success. - Remove the redundant and stale irq disabled check in the posix cpu timer code. The check is at the wrong place anyway and lockdep already covers it via the sighand lock locking coverage" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: time: Make sure jiffies_to_msecs() preserves non-zero time periods posix-timers: Fix nanosleep_copyout() for CONFIG_COMPAT_32BIT_TIME clocksource/drivers/stm32: Fix error return code posix-cpu-timers: Remove lockdep_assert_irqs_disabled()
| * | | clocksource/drivers/stm32: Fix error return codeJulia Lawall2018-06-121-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Return an error code on failure. Problem found using Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: kernel-janitors@vger.kernel.org Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Cc: linux-arm-kernel@lists.infradead.org Link: https://lkml.kernel.org/r1528640655-18948-3-git-send-email-Julia.Lawall@lip6.fr
* | | | Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds2018-06-243-16/+58
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: "A set of fixes mostly for the ARM/GIC world: - Fix the MSI affinity handling in the ls-scfg irq chip driver so it updates and uses the effective affinity mask correctly - Prevent binding LPIs to offline CPUs and respect the Cavium erratum which requires that LPIs which belong to an offline NUMA node are not bound to a CPU on a different NUMA node. - Free only the amount of allocated interrupts in the GIC-V2M driver instead of trying to free log2(nrirqs). - Prevent emitting SYNC and VSYNC targetting non existing interrupt collections in the GIC-V3 ITS driver - Ensure that the GIV-V3 interrupt redistributor is correctly reprogrammed on CPU hotplug - Remove a stale unused helper function" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqdesc: Delete irq_desc_get_msi_desc() irqchip/gic-v3-its: Fix reprogramming of redistributors on CPU hotplug irqchip/gic-v3-its: Only emit VSYNC if targetting a valid collection irqchip/gic-v3-its: Only emit SYNC if targetting a valid collection irqchip/gic-v3-its: Don't bind LPI to unavailable NUMA node irqchip/gic-v2m: Fix SPI release on error path irqchip/ls-scfg-msi: Fix MSI affinity handling genirq/debugfs: Add missing IRQCHIP_SUPPORTS_LEVEL_MSI debug
| * | | | irqchip/gic-v3-its: Fix reprogramming of redistributors on CPU hotplugMarc Zyngier2018-06-221-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Enabling LPIs was made a lot stricter recently, by checking that they are disabled before enabling them. By doing so, the CPU hotplug case was missed altogether, which leaves LPIs enabled on hotplug off (expecting the CPU to eventually come back), and won't write a different value anyway on hotplug on. So skip that check if that particular case is detected Fixes: 6eb486b66a30 ("irqchip/gic-v3: Ensure GICR_CTLR.EnableLPI=0 is observed before enabling") Reported-by: Sumit Garg <sumit.garg@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Sumit Garg <sumit.garg@linaro.org> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Yang Yingliang <yangyingliang@huawei.com> Link: https://lkml.kernel.org/r/20180622095254.5906-8-marc.zyngier@arm.com
| * | | | irqchip/gic-v3-its: Only emit VSYNC if targetting a valid collectionMarc Zyngier2018-06-221-5/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Similarily to the SYNC operation, it must be verified that the VPE targetted by a VLPI is backed by a valid collection in the GIC driver data structures. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Yang Yingliang <yangyingliang@huawei.com> Cc: Sumit Garg <sumit.garg@linaro.org> Link: https://lkml.kernel.org/r/20180622095254.5906-7-marc.zyngier@arm.com
| * | | | irqchip/gic-v3-its: Only emit SYNC if targetting a valid collectionMarc Zyngier2018-06-221-6/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is possible, under obscure circumstances, to convince the ITS driver to emit a SYNC operation that targets a collection that is not bound to any redistributor (and the target_address field is zero) because the corresponding CPU has not been seen yet (the system has been booted with max_cpus="something small"). If the ITS is using the linear CPU number as the target, this is not a big deal, as we just end-up issuing a SYNC to CPU0. But if the ITS requires the physical address of the redistributor (with GITS_TYPER.PTA==1), we end-up asking the ITS to write to the physical address zero, which is not exactly a good idea (there has been report of the ITS locking up). This should of course never happen, but hey, this is SW... In order to avoid the above disaster, let's track which collections have been actually initialized, and let's not generate a SYNC if the collection hasn't been properly bound to a redistributor. Take this opportunity to spit our a warning, in the hope that someone may report the issue if it arrises again. Reported-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Sumit Garg <sumit.garg@linaro.org> Link: https://lkml.kernel.org/r/20180622095254.5906-6-marc.zyngier@arm.com
| * | | | irqchip/gic-v3-its: Don't bind LPI to unavailable NUMA nodeYang Yingliang2018-06-221-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On a NUMA system, if an ITS is local to an offline node, the ITS driver may pick an offline CPU to bind the LPI. In this case, pick an online CPU (and the first one will do). But on some systems, binding an LPI to non-local node CPU may cause deadlock (see Cavium erratum 23144). In this case, just fail the activate and return an error code. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Sumit Garg <sumit.garg@linaro.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180622095254.5906-5-marc.zyngier@arm.com
| * | | | irqchip/gic-v2m: Fix SPI release on error pathMarc Zyngier2018-06-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On failing to allocate the required SPIs, the actual number of interrupts should be freed and not its log2 value. Fixes: de337ee30142 ("irqchip/gic-v2m: Add PCI Multi-MSI support") Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Yang Yingliang <yangyingliang@huawei.com> Cc: Sumit Garg <sumit.garg@linaro.org> Link: https://lkml.kernel.org/r/20180622095254.5906-4-marc.zyngier@arm.com
| * | | | irqchip/ls-scfg-msi: Fix MSI affinity handlingMarc Zyngier2018-06-221-3/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ls-scfs-msi driver is not dealing with the effective affinity as it should. Let's fix that, and make it clear that the effective affinity is restricted to a single CPU. Also prevent the driver from messing with the internals of the affinity setting infrastructure. Reported-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Yang Yingliang <yangyingliang@huawei.com> Cc: Sumit Garg <sumit.garg@linaro.org> Link: https://lkml.kernel.org/r/20180622095254.5906-3-marc.zyngier@arm.com
* | | | | Merge tag 'for-linus-20180623' of git://git.kernel.dk/linux-blockLinus Torvalds2018-06-2410-56/+125
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block fixes from Jens Axboe: - Further timeout fixes. We aren't quite there yet, so expect another round of fixes for that to completely close some of the IRQ vs completion races. (Christoph/Bart) - Set of NVMe fixes from the usual suspects, mostly error handling - Two off-by-one fixes (Dan) - Another bdi race fix (Jan) - Fix nbd reconfigure with NBD_DISCONNECT_ON_CLOSE (Doron) * tag 'for-linus-20180623' of git://git.kernel.dk/linux-block: blk-mq: Fix timeout handling in case the timeout handler returns BLK_EH_DONE bdi: Fix another oops in wb_workfn() lightnvm: Remove depends on HAS_DMA in case of platform dependency nvme-pci: limit max IO size and segments to avoid high order allocations nvme-pci: move nvme_kill_queues to nvme_remove_dead_ctrl nvme-fc: release io queues to allow fast fail nbd: Add the nbd NBD_DISCONNECT_ON_CLOSE config flag. block: sed-opal: Fix a couple off by one bugs blk-mq-debugfs: Off by one in blk_mq_rq_state_name() nvmet: reset keep alive timer in controller enable nvme-rdma: don't override opts->queue_size nvme-rdma: Fix command completion race at error recovery nvme-rdma: fix possible free of a non-allocated async event buffer nvme-rdma: fix possible double free condition when failing to create a controller Revert "block: Add warning for bi_next not NULL in bio_endio()" block: fix timeout changes for legacy request drivers
| * | | | | lightnvm: Remove depends on HAS_DMA in case of platform dependencyGeert Uytterhoeven2018-06-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove dependencies on HAS_DMA where a Kconfig symbol depends on another symbol that implies HAS_DMA, and, optionally, on "|| COMPILE_TEST". In most cases this other symbol is an architecture or platform specific symbol, or PCI. Generic symbols and drivers without platform dependencies keep their dependencies on HAS_DMA, to prevent compiling subsystems or drivers that cannot work anyway. This simplifies the dependencies, and allows to improve compile-testing. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Mark Brown <broonie@kernel.org> Acked-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | | | Merge branch 'nvme-4.18' of git://git.infradead.org/nvme into for-linusJens Axboe2018-06-226-45/+88
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NVMe fixes from Christoph: "Various relatively small fixes, mostly to fix error handling of various sorts." * 'nvme-4.18' of git://git.infradead.org/nvme: nvme-pci: limit max IO size and segments to avoid high order allocations nvme-pci: move nvme_kill_queues to nvme_remove_dead_ctrl nvme-fc: release io queues to allow fast fail nvmet: reset keep alive timer in controller enable nvme-rdma: don't override opts->queue_size nvme-rdma: Fix command completion race at error recovery nvme-rdma: fix possible free of a non-allocated async event buffer nvme-rdma: fix possible double free condition when failing to create a controller
| | * | | | | nvme-pci: limit max IO size and segments to avoid high order allocationsJens Axboe2018-06-213-5/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | nvme requires an sg table allocation for each request. If the request is large, then the allocation can become quite large. For instance, with our default software settings of 1280KB IO size, we'll need 10248 bytes of sg table. That turns into a 2nd order allocation, which we can't always guarantee. If we fail the allocation, blk-mq will retry it later. But there's no guarantee that we'll EVER be able to allocate that much contigious memory. Limit the IO size such that we never need more than a single page of memory. That's a lot faster and more reliable. Then back that allocation with a mempool, so that we know we'll always be able to succeed the allocation at some point. Signed-off-by: Jens Axboe <axboe@kernel.dk> Acked-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| | * | | | | nvme-pci: move nvme_kill_queues to nvme_remove_dead_ctrlJianchao Wang2018-06-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is race between nvme_remove and nvme_reset_work that can lead to io hang. nvme_remove nvme_reset_work -> nvme_remove_dead_ctrl -> nvme_dev_disable -> quiesce request_queue -> queue remove_work -> cancel_work_sync reset_work -> nvme_remove_namespaces -> splice ctrl->namespaces nvme_remove_dead_ctrl_work -> nvme_kill_queues -> nvme_ns_remove do nothing -> blk_cleanup_queue -> blk_freeze_queue Finally, the request_queue is quiesced state when wait freeze, we will get io hang here. To fix it, move the nvme_kill_queues from nvme_remove_dead_ctrl_work to nvme_remove_dead_ctrl. Suggested-by: Keith Busch <keith.busch@linux.intel.com> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| | * | | | | nvme-fc: release io queues to allow fast failJames Smart2018-06-211-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than leaving io queues quiesced after tearing down an association, restart them. This allows ios to be replayed, with fastfail ios terminating and non-fastfail getting into loops of retry. This follows rdma's lead. Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Sagi Grimberg <sagi@grimber.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
| | * | | | | nvmet: reset keep alive timer in controller enableMax Gurtuvoy2018-06-201-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Controllers that are not yet enabled should not really enforce keep alive timeouts, but we still want to track a timeout and cleanup in case a host died before it enabled the controller. Hence, simply reset the keep alive timer when the controller is enabled. Suggested-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
| | * | | | | nvme-rdma: don't override opts->queue_sizeSagi Grimberg2018-06-201-11/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | That is user argument, and theoretically controller limits can change over time (over reconnects/resets). Instead, use the sqsize controller attribute to check queue depth boundaries and use it to the tagset allocation. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
| | * | | | | nvme-rdma: Fix command completion race at error recoveryIsrael Rukshin2018-06-201-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The race is between completing the request at error recovery work and rdma completions. If we cancel the request before getting the good rdma completion we get a NULL deref of the request MR at nvme_rdma_process_nvme_rsp(). When Canceling the request we return its mr to the mr pool (set mr to NULL) and also unmap its data. Canceling the requests while the rdma queues are active is not safe. Because rdma queues are active and we get good rdma completions that can use the mr pointer which may be NULL. Completing the request too soon may lead also to performing DMA to/from user buffers which might have been already unmapped. The commit fixes the race by draining the QP before starting the abort commands mechanism. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
| | * | | | | nvme-rdma: fix possible free of a non-allocated async event bufferSagi Grimberg2018-06-201-13/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If nvme_rdma_configure_admin_queue fails before we allocated the async event buffer, we will falsly free it because nvme_rdma_free_queue is freeing it. Fix it by allocating the buffer right after nvme_rdma_alloc_queue and free it right before nvme_rdma_queue_free to maintain orderly reverse cleanup sequence. Reported-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| | * | | | | nvme-rdma: fix possible double free condition when failing to create a ↵Sagi Grimberg2018-06-201-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | controller Failures after nvme_init_ctrl will defer resource cleanups to .free_ctrl when the reference is released, hence we should not free the controller queues for these failures. Fix that by moving controller queues allocation before controller initialization and correctly freeing them for failures before initialization and skip them for failures after initialization. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | | | | | nbd: Add the nbd NBD_DISCONNECT_ON_CLOSE config flag.Doron Roberts-Kedes2018-06-201-8/+34
| |/ / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If NBD_DISCONNECT_ON_CLOSE is set on a device, then the driver will issue a disconnect from nbd_release if the device has no remaining bdev->bd_openers. Fix ret val so reconfigure with only setting the flag succeeds. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | | | block: fix timeout changes for legacy request driversChristoph Hellwig2018-06-192-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | blk_mq_complete_request can only be called for blk-mq drivers, but when removing the BLK_EH_HANDLED return value, two legacy request timeout methods incorrectly got switched to call blk_mq_complete_request. Call __blk_complete_request instead to reinstance the previous behavior. For that __blk_complete_request needs to be exported. Fixes: 1fc2b62e ("scsi_transport_fc: complete requests from ->timeout") Fixes: 0df0bb08 ("null_blk: complete requests from ->timeout") Reported-by: Jianchao Wang <jianchao.w.wang@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | | | | | Merge branch 'linus' of ↵Linus Torvalds2018-06-242-5/+11
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: - Fix use after free in chtls - Fix RBP breakage in sha3 - Fix use after free in hwrng_unregister - Fix overread in morus640 - Move sleep out of kernel_neon in arm64/aes-blk * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: hwrng: core - Always drop the RNG in hwrng_unregister() crypto: morus640 - Fix out-of-bounds access crypto: don't optimize keccakf() crypto: arm64/aes-blk - fix and move skcipher_walk_done out of kernel_neon_begin, _end crypto: chtls - use after free in chtls_pt_recvmsg()
| * | | | | | hwrng: core - Always drop the RNG in hwrng_unregister()Michael Büsch2018-06-151-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | enable_best_rng() is used in hwrng_unregister() to switch away from the currently active RNG, if that is the one currently being removed. However enable_best_rng() might fail, if the next RNG's init routine fails. In that case enable_best_rng() will return an error code and the currently active RNG will remain active. After unregistering this might lead to crashes due to use-after-free. Fix this by dropping the currently active RNG, if enable_best_rng() failed. This will result in no RNG to be active, if the next-best one failed to initialize. This problem was introduced by 142a27f0a731ddcf467546960a5585970ca98e21 Fixes: 142a27f0a731 ("hwrng: core - Reset user selected rng by...") Reported-by: Wirz <spam@lukas-wirz.de> Tested-by: Wirz <spam@lukas-wirz.de> Signed-off-by: Michael Büsch <m@bues.ch> Cc: stable@vger.kernel.org Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>