summaryrefslogtreecommitdiffstats
path: root/drivers/accel
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'drm-next-2024-01-10' of git://anongit.freedesktop.org/drm/drmLinus Torvalds2024-01-1247-1177/+2175
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull drm updates from Dave Airlie: "This contains two major new drivers: - imagination is a first driver for Imagination Technologies devices, it only covers very specific devices, but there is hope to grow it - xe is a reboot of the i915 GPU (shares display) side using a more upstream focused development model, and trying to maximise code sharing. It's not enabled for any hw by default, and will hopefully get switched on for Intel's Lunarlake. This also drops a bunch of the old UMS ioctls. It's been dead long enough. amdgpu has a bunch of new color management code that is being used in the Steam Deck. amdgpu also has a new ACPI WBRF interaction to help avoid radio interference. Otherwise it's the usual lots of changes in lots of places. Detailed summary: new drivers: - imagination - new driver for Imagination Technologies GPU - xe - new driver for Intel GPUs using core drm concepts core: - add CLOSE_FB ioctl - remove old UMS ioctls - increase max objects to accomodate AMD color mgmt encoder: - create per-encoder debugfs directory edid: - split out drm_eld - SAD helpers - drop edid_firmware module parameter format-helper: - cache format conversion buffers sched: - move from kthread to workqueue - rename some internals - implement dynamic job-flow control gpuvm: - provide more features to handle GEM objects client: - don't acquire module reference displayport: - add mst path property documentation fdinfo: - alignment fix dma-buf: - add fence timestamp helper - add fence deadline support bridge: - transparent aux-bridge for DP/USB-C - lt8912b: add suspend/resume support and power regulator support panel: - edp: AUO B116XTN02, BOE NT116WHM-N21,836X2, NV116WHM-N49 - chromebook panel support - elida-kd35t133: rework pm - powkiddy RK2023 panel - himax-hx8394: drop prepare/unprepare and shutdown logic - BOE BP101WX1-100, Powkiddy X55, Ampire AM8001280G - Evervision VGG644804, SDC ATNA45AF01 - nv3052c: register docs, init sequence fixes, fascontek FS035VG158 - st7701: Anbernic RG-ARC support - r63353 panel controller - Ilitek ILI9805 panel controller - AUO G156HAN04.0 simplefb: - support memory regions - support power domains amdgpu: - add new 64-bit sequence number infrastructure - add AMD specific color management - ACPI WBRF support for RF interference handling - GPUVM updates - RAS updates - DCN 3.5 updates - Rework PCIe link speed handling - Document GPU reset types - DMUB fixes - eDP fixes - NBIO 7.9/7.11 updates - SubVP updates - XGMI PCIe state dumping for aqua vanjaram - GFX11 golden register updates - enable tunnelling on high pri compute amdkfd: - Migrate TLB flushing logic to amdgpu - Trap handler fixes - Fix restore workers handling on suspend/resume - Fix possible memory leak in pqm_uninit() - support import/export of dma-bufs using GEM handles radeon: - fix possible overflows in command buffer checking - check for errors in ring_lock i915: - reorg display code for reuse in xe driver - fdinfo memory stats printing - DP MST bandwidth mgmt improvements - DP panel replay enabling - MTL C20 phy state verification - MTL DP DSC fractional bpp support - Audio fastset support - use dma_fence interfaces instead of i915_sw_fence - Separate gem and display code - AUX register macro refactoring - Separate display module/device parameters - Move display capabilities debugfs under display - Makefile cleanups - Register cleanups - Move display lock inits under display/ - VLV/CHV DPIO PHY register and interface refactoring - DSI VBT sequence refactoring - C10/C20 PHY PLL hardware readout - DPLL code cleanups - Cleanup PXP plane protection checks - Improve display debug msgs - PSR selective fetch fixes/improvements - DP MST fixes - Xe2LPD FBC restrictions removed - DGFX uses direct VBT pin mapping - more MTL WAs - fix MTL eDP bug - eliminate use of kmap_atomic habanalabs: - sysfs entry to identify a device minor id with debugfs path - sysfs entry to expose device module id - add signed device info retrieval through INFO ioctl - add Gaudi2C device support - pcie reset prepare/done hooks msm: - Add support for SDM670, SM8650 - Handle the CFG interconnect to fix the obscure hangs / timeouts - Kconfig fix for QMP dependency - use managed allocators - DPU: SDM670, SM8650 support - DPU: Enable SmartDMA on SM8350 and SM8450 - DP: enable runtime PM support - GPU: add metadata UAPI - GPU: move devcoredumps to GPU device - GPU: convert to drm_exec ivpu: - update FW API - new debugfs file - a new NOP job submission test mode - improve suspend/resume - PM improvements - MMU PT optimizations - firmware profile frequency support - support for uncached buffers - switch to gem shmem helpers - replace kthread with threaded irqs rockchip: - rk3066_hdmi: convert to atomic - vop2: support nv20 and nv30 - rk3588 support mediatek: - use devm_platform_ioremap_resource - stop using iommu_present - MT8188 VDOSYS1 display support panfrost: - PM improvements - improve interrupt handling as poweroff qaic: - allow to run with single MSI - support host/device time sync - switch to persistent DRM devices exynos: - fix potential error pointer dereference - fix wrong error checking - add missing call to drm_atomic_helper_shutdown omapdrm: - dma-fence lockdep annotation fix tidss: - dma-fence lockdep annotation fix - support for AM62A7 v3d: - BCM2712 - rpi5 support - fdinfo + gputop support - uapi for CPU job handling virtio-gpu: - add context debug name" * tag 'drm-next-2024-01-10' of git://anongit.freedesktop.org/drm/drm: (2340 commits) drm/amd/display: Allow z8/z10 from driver drm/amd/display: fix bandwidth validation failure on DCN 2.1 drm/amdgpu: apply the RV2 system aperture fix to RN/CZN as well drm/amd/display: Move fixpt_from_s3132 to amdgpu_dm drm/amd/display: Fix recent checkpatch errors in amdgpu_dm Revert "drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole" drm/amd/display: avoid stringop-overflow warnings for dp_decide_lane_settings() drm/amd/display: Fix power_helpers.c codestyle drm/amd/display: Fix hdcp_log.h codestyle drm/amd/display: Fix hdcp2_execution.c codestyle drm/amd/display: Fix hdcp_psp.h codestyle drm/amd/display: Fix freesync.c codestyle drm/amd/display: Fix hdcp_psp.c codestyle drm/amd/display: Fix hdcp1_execution.c codestyle drm/amd/pm/smu7: fix a memleak in smu7_hwmgr_backend_init drm/amdkfd: Fix iterator used outside loop in 'kfd_add_peer_prop()' drm/amdgpu: Drop 'fence' check in 'to_amdgpu_amdkfd_fence()' drm/amdkfd: Confirm list is non-empty before utilizing list_first_entry in kfd_topology.c drm/amdgpu: Fix '*fw' from request_firmware() not released in 'amdgpu_ucode_request()' drm/amdgpu: Fix variable 'mca_funcs' dereferenced before NULL check in 'amdgpu_mca_smu_get_mca_entry()' ...
| * accel/habanalabs: fix information leak in sec_attest_info()Xingyuan Mo2023-12-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | This function may copy the pad0 field of struct hl_info_sec_attest to user mode which has not been initialized, resulting in leakage of kernel heap data to user mode. To prevent this, use kzalloc() to allocate and zero out the buffer, which can also eliminate other uninitialized holes, if any. Fixes: 0c88760f8f5e ("habanalabs/gaudi2: add secured attestation info uapi") Signed-off-by: Xingyuan Mo <hdthky0@gmail.com> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * accel/habanalabs/gaudi2: avoid overriding existing undefined opcode dataTomer Tayar2023-12-191-21/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Part of the undefined opcode data is updated in gaudi2_handle_qman_err_generic() and some in handle_lower_qman_data_on_err(). However, the 'write_enable' flag is checked only in gaudi2_handle_qman_err_generic(), and information of more than a single error can be mixed there. Moreover, handle_lower_qman_data_on_err() is called only for the lower QMAN, so for an error in the upper QMAN there is only a partial info. Move all the data update to be done in a single place, protected by the 'write_enable' flag. As mainly the lower QMAN's info is interesting, avoid saving the partial info for the upper QMAN. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * accel/habanalabs: add parent_device sysfs attributeTomer Tayar2023-12-192-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The device debugfs directory was modified to be named as the device-name. This name is the parent device name, i.e. either the PCI address in case of an ASIC, or the simulator device name in case of a simulator. This change makes it more difficult for a user to access the debugfs directory for a specific accel device, because he can't just use the accel minor id, but he needs to do more device-dependent operations to get the device name. To make it easier to get this name, add a 'parent_device' sysfs attribute that the user can read using the minor id before accessing debugfs. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * accel/habanalabs/gaudi2: add zero padding when printing QM CP instructionTomer Tayar2023-12-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | QM instructions are in multiples of 64 bits and the command type is in the upper bits of first QWORD. To make it clearer that an undefined command is due to a type of 0x0, always print all 64 bits and add a zero padding if needed. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * accel/habanalabs: report 3 instances of Infineon second stageAriel Suller2023-12-191-2/+18
| | | | | | | | | | | | | | | | | | Infineon controller second stage has 3 instances that their version need to be reported by driver. Signed-off-by: Ariel Suller <asuller@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * accel/habanalabs/gaudi2: add signed dev info uAPIMoti Haimovski2023-12-193-0/+63
| | | | | | | | | | | | | | | | | | User will provide a nonce via the INFO ioctl, and will retrieve the signed device info generated using given nonce. Signed-off-by: Moti Haimovski <mhaimovski@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * accel/habanalabs/gaudi2: use correct registers to dump QM CQ infoTomer Tayar2023-12-192-12/+12
| | | | | | | | | | | | | | | | | | | | | | The QM CQ PTR_LO/PTR_HI/TSIZE registers are for pushing a CQ entry, and although they are updated by HW even when descriptors are fetched by PQ and CB addresses are fed into CQ, the correct registers to use when dumping the CQ info are the ones with the _STS suffix. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * accel/habanalabs: expose module id through sysfsDani Liberman2023-12-191-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Module ID exposes the physical location of the device in the server, from the pov of the devices in regard to how they are connected by internal fabric. This information is already exposed in our INFO ioctl, but there are utilities and scripts running in data-center which are already accessing sysfs for topology information and it is easier for them to continue getting that information from sysfs instead of opening a file descriptor. Signed-off-by: Dani Liberman <dliberman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * accel/habanalabs: print error code when mapping failsDani Liberman2023-12-191-3/+4
| | | | | | | | | | | | | | | | | | Failure to map is considered a non-trivial error and we need to notify the user about it. Signed-off-by: Dani Liberman <dliberman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * accel/habanalabs/gaudi2: get the correct QM CQ info upon an errorTomer Tayar2023-12-192-22/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Upon a QM error, the address/size from both the CQ and the ARC_CQ are printed, although the instruction that led to the error was received from only one of them. Moreover, in case of a QM undefined opcode, only one of these address/size sets will be captured based on the value of ARC_CQ_PTR. However, this value can be non-zero even if currently the CQ is used, in case the CQ/ARC_CQ are alternately used. Under the assumption of having a stop-on-error configuration, modify to use CP_STS.CUR_CQ field to get the relevant CQ for the QM error. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * accel/habanalabs: set hard reset flag if graceful reset is skippedTomer Tayar2023-12-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | hl_device_cond_reset() might be called with the hard reset flag unset, because a compute reset upon device release as part of a graceful reset is valid. If the conditions for graceful reset are not met, hl_device_reset() will be called for an immediate reset. In this case a compute reset is not valid, so it will be replaced with a hard reset together with a debug message about it. This message might be confusing, as it implies that a compute reset was requested when it shouldn't. To prevent this confusion, set the hard reset flag in hl_device_cond_reset() if going to an immediate reset. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * accel/habanalabs: remove 'get temperature' debug printOfir Bitton2023-12-191-4/+0
| | | | | | | | | | | | | | | | | | The print was added long back for a specific debug and can now be removed. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * accel/habanalabs/gaudi2: fix undef opcode reportingDafna Hirschfeld2023-12-191-8/+6
| | | | | | | | | | | | | | | | | | | | currently the undefined opcode event bit in set only for lower cp and only if 'write_enable' is true. It should be set anyway and for all streams in order to report that event to userspace. Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * accel/habanalabs: fix EQ heartbeat mechanismFarah Kassabri2023-12-191-7/+7
| | | | | | | | | | | | | | | | | | Stop rescheduling another heartbeat check when EQ heartbeat check fails as it generates confusing logs in dmesg that the heartbeat fails. Signed-off-by: Farah Kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * accel/habanalabs: add support for Gaudi2C deviceOded Gabbay2023-12-196-0/+13
| | | | | | | | | | | | | | Gaudi2 with PCI revision ID with the value of '3' represents Gaudi2C device and should be detected and initialized as Gaudi2. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * accel/habanalabs: add log when eq event is not receivedFarah Kassabri2023-12-191-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | Add error log when no eq event is received from FW, to cover a scenario when FW is stuck for some reason. In such case driver will not receive neither the eq error interrupt or the eq heartbeat event, and will just initiate a reset without indication in the dmesg about the reason. Signed-off-by: Farah Kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * accel/habanalabs/gaudi2: assume hard-reset by FW upon PCIe AXI drainTomer Tayar2023-12-192-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | When a PCIe AXI drain event happens, it is possible that the driver cannot access the device through PCIe, and therefore cannot send a hard-reset request to FW. Starting from FW version 1.13, FW will initiate a hard-reset in such a case without waiting for a reset request from the driver. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * accel/habanalabs: update device boot error checkFarah Kassabri2023-12-191-83/+32
| | | | | | | | | | | | | | | | | | | | Use a predefined mask which set the device critical boot errors. Driver will fail and stop its loading, only upon detecting at least one of those errors defined in this mask. Signed-off-by: Farah Kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * accel/habanalabs: add pcie reset prepare/done hooksfarah kassabri2023-12-191-0/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When working on a bare-metal system, if FLR will happen the firmware will handle it and driver will have no knowledge of it, and this will cause two issues: 1.The driver will be in operational state while it should be in reset. This will cause the heartbeat mechanism to keep sending messages to FW while pci device is in reset. Eventually heartbeat will fail and the device will end up in non-operational state. 2. After FW handles the FLR, and due to the reset it'll go back to preboot stage, and driver need to perform hard reset in order to load the boot fit binary. This patch will add reset_prepare hook that will set the device to be in disabled state, so it'll be not operational, and also reset_done hook which will be called after the actual FLR handling, then it will perform hard reset. Signed-off-by: farah kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * accel: Include <drm/drm_auth.h>Thomas Zimmermann2023-12-061-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | One of the source files includes <drm/drm_auth.h> via <drm/drm_legacy.h>, which will be removed. Include drm_auth.h directly. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Cc: Oded Gabbay <ogabbay@kernel.org> Reviewed-by: David Airlie <airlied@gmail.com> Reviewed-by: Daniel Vetter <daniel@ffwll.ch> Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231122122449.11588-6-tzimmermann@suse.de
| * accel/qaic: Expand DRM device lifecycleCarl Vanderlip2023-12-013-29/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the QAIC DRM device registers itself when the MHI QAIC_CONTROL channel becomes available. This is when the device is able to process workloads. However, the DRM driver also provides the debugfs interface bootlog for the device. If the device fails to boot to the QSM (which brings up the MHI QAIC_CONTROL channel), the bootlog won't be available for debugging why it failed to boot. Change when the DRM device registers itself from when QAIC_CONTROL is available to when the card is first probed on the PCI bus. Additionally, make the DRM driver persist through reset/error cases so the driver doesn't have to be reloaded to access the card again. Send KOBJ_ONLINE/OFFLINE uevents so userspace can know when DRM device is ready to handle requests. Signed-off-by: Carl Vanderlip <quic_carlv@quicinc.com> Reviewed-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231117174337.20174-3-quic_jhugo@quicinc.com
| * accel/qaic: Increase number of in_reset statesCarl Vanderlip2023-12-014-18/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | 'in_reset' holds the state of the device. As part of bringup, the device needs to be queried to check if it's in a valid state. Add a new state that indicates that the device is coming up, but not ready for users yet. Rename to 'dev_state' to better describe the variable. Signed-off-by: Carl Vanderlip <quic_carlv@quicinc.com> Reviewed-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231117174337.20174-2-quic_jhugo@quicinc.com
| * Merge v6.7-rc3 into drm-nextDaniel Vetter2023-11-281-22/+21
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Thomas Zimermann needs 8d6ef26501 ("drm/ast: Disconnect BMC if physical connector is connected") for further ast work in -next. Minor conflicts in ivpu between 3de6d9597892 ("accel/ivpu: Pass D0i3 residency time to the VPU firmware") and 3f7c0634926d ("accel/ivpu/37xx: Fix hangs related to MMIO reset") changing adjacent lines. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
| * | accel/qaic: Update MAX_ORDER use to be inclusiveJeffrey Hugo2023-11-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | MAX_ORDER was redefined so that valid allocations to the page allocator are in the range of 0..MAX_ORDER, inclusive in the commit 23baf831a32c ("mm, treewide: redefine MAX_ORDER sanely"). We are treating MAX_ORDER as an exclusive value, and thus could be requesting larger allocations. Update our use to match the redefinition of MAX_ORDER. Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com> Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com> Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231103153302.20642-1-quic_jhugo@quicinc.com
| * | accel/ivpu: Use threaded IRQ to handle JOB done messagesJacek Lawrynowicz2023-11-168-201/+199
| | | | | | | | | | | | | | | | | | | | | | | | | | | Remove job_done thread and replace it with generic callback based mechanism. Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231113170252.758137-6-jacek.lawrynowicz@linux.intel.com
| * | accel/ivpu: Use dedicated work for job timeout detectionStanislaw Gruszka2023-11-163-15/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change to use work for timeout detection. Needed for thread_irq conversion. Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231113170252.758137-5-jacek.lawrynowicz@linux.intel.com
| * | accel/ivpu: Do not use cons->aborted for job_done_threadStanislaw Gruszka2023-11-162-10/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allow to simplify ivpu_ipc_receive() as now we do not have to process all messages in aborted state - they will be freed in ivpu_ipc_consumer_del(). Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231113170252.758137-4-jacek.lawrynowicz@linux.intel.com
| * | accel/ivpu: Do not use irqsave in ivpu_ipc_dispatchStanislaw Gruszka2023-11-161-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ivpu_ipc_dispatch is always called with irqs disabled. Add lockdep assertion and remove unneeded _irqsave/_irqrestore. Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231113170252.758137-3-jacek.lawrynowicz@linux.intel.com
| * | accel/ivpu: Rename cons->rx_msg_lockStanislaw Gruszka2023-11-162-15/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | Now the cons->rx_msg_lock also protects 'abort' field so rename the lock. Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231113170252.758137-2-jacek.lawrynowicz@linux.intel.com
| * | Merge drm/drm-next into drm-misc-nextMaxime Ripard2023-11-1538-3167/+1789
| |\ \ | | | | | | | | | | | | | | | | | | | | Let's kickstart the v6.8 release cycle. Signed-off-by: Maxime Ripard <mripard@kernel.org>
| * | | accel/ivpu: Use GEM shmem helper for all buffersJacek Lawrynowicz2023-11-085-460/+144
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use struct drm_gem_shmem_object as a base for struct ivpu_bo. This cuts by 50% the buffer management code. Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231031073156.1301669-5-stanislaw.gruszka@linux.intel.com
| * | | accel/ivpu: Remove support for uncached buffersJacek Lawrynowicz2023-11-082-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Usages of DRM_IVPU_BO_UNCACHED should be replaced by DRM_IVPU_BO_WC. There is no functional benefit from DRM_IVPU_BO_UNCACHED if these buffers are never mapped to host VM. This allows to cut the buffer handling code in the kernel driver by half. Usage of DRM_IVPU_BO_UNCACHED buffers was removed from user-space driver and will not be part of first UMD release. Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231031073156.1301669-4-stanislaw.gruszka@linux.intel.com
| * | | accel/ivpu: Fix locking in ivpu_bo_remove_all_bos_from_context()Jacek Lawrynowicz2023-11-087-90/+109
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ivpu_bo_remove_all_bos_from_context() could race with ivpu_bo_free() when prime buffer was closed after vpu device was closed. Move the bo_list from context to vdev and use a dedicated lock to sync it. This list is not modified when BO is added/removed from a context. Also rename ivpu_bo_free_vpu_addr() to ivpu_bo_unbind() because this function does more then just free vpu_addr. Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231031073156.1301669-3-stanislaw.gruszka@linux.intel.com
| * | | accel/ivpu: Allocate vpu_addr in gem->open() callbackJacek Lawrynowicz2023-11-083-37/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use gem->open() callback to simplify the code and prepare for gem_shmem conversion. It is called during handle creation for a gem object, during prime import and in BO_CREATE ioctl. Hence can be used for vpu_addr allocation. On the way remove unused bo->user_ptr field. Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231031073156.1301669-2-stanislaw.gruszka@linux.intel.com
| * | | accel/ivpu: Fix compilation with CONFIG_PM=nJacek Lawrynowicz2023-11-071-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use pm_runtime_status_suspended() instead of dev->power.runtime_status field that is not available without PM. Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Link: https://lore.kernel.org/all/20231106130827.1600948-1-jacek.lawrynowicz@linux.intel.com
| * | | accel/qaic: Support for 0 resize slice execution in BOPranjal Ramajor Asha Kanojiya2023-11-031-61/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support to partially execute a slice which is resized to zero. Executing a zero size slice in a BO should mean that there is no DMA transfers involved but you should still configure doorbell and semaphores. For example consider a BO of size 18K and it is sliced into 3 6K slices and user calls partial execute ioctl with resize as 10K. slice 0 - size is 6k and offset is 0, so resize of 10K will not cut short this slice hence we send the entire slice for execution. slice 1 - size is 6k and offset is 6k, so resize of 10K will cut short this slice and only the first 4k should be DMA along with configuring doorbell and semaphores. slice 2 - size is 6k and offset is 12k, so resize of 10k will cut short this slice and no DMA transfer would be involved but we should would configure doorbell and semaphores. This change begs to change the behavior of 0 resize. Currently, 0 resize partial execute ioctl behaves exactly like execute ioctl i.e. no resize. After this patch all the slice in BO should behave exactly like slice 2 in above example. Refactor copy_partial_exec_reqs() to make it more readable and less complex. Signed-off-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231027164330.11978-1-quic_jhugo@quicinc.com
| * | | accel/qaic: Quiet array bounds check on DMA abort messageCarl Vanderlip2023-11-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current wrapper is right-sized to the message being transferred; however, this is smaller than the structure defining message wrappers since the trailing element is a union of message/transfer headers of various sizes (8 and 32 bytes on 32-bit system where issue was reported). Using the smaller header with a small message (wire_trans_dma_xfer is 24 bytes including header) ends up being smaller than a wrapper with the larger header. There are no accesses outside of the defined size, however they are possible if the larger union member is referenced. Abort messages are outside of hot-path and changing the wrapper struct would require a larger rewrite, so having the memory allocated to the message be 8 bytes too big is acceptable. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202310182253.bcb9JcyJ-lkp@intel.com/ Signed-off-by: Carl Vanderlip <quic_carlv@quicinc.com> Reviewed-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231027180810.4873-1-quic_jhugo@quicinc.com
| * | | accel/ivpu: Rename VPU to NPU in product stringsJacek Lawrynowicz2023-10-312-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | VPU was rebranded as NPU (Neural Processing Unit) so user facing strings have to be updated but the code remains as is and the module is still called intel_vpu.ko. Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231028155936.1183342-9-stanislaw.gruszka@linux.intel.com
| * | | accel/ivpu: Simplify MMU SYNC commandJacek Lawrynowicz2023-10-311-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CMD_SYNC does not need any args as we poll for completion anyway. Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231028155936.1183342-8-stanislaw.gruszka@linux.intel.com
| * | | accel/ivpu: Make DMA allocations for MMU600 write combinedKarol Wachowski2023-10-311-52/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously using dma_alloc_wc() API we created cache coherent (mapped as write-back) mappings. Because we disable MMU600 snooping it was required to do costly page walk and cache flushes after each page table modification. With write-combined buffers it's possible to do a single write memory barrier to flush write-combined buffer to memory which simplifies the driver and significantly reduce time of map/unmap operations. Mapping time of 255 MB is reduced from 2.5 ms to 500 us. Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231028155936.1183342-7-stanislaw.gruszka@linux.intel.com
| * | | accel/ivpu: Print CMDQ errors after consumer timeoutKarol Wachowski2023-10-311-3/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add checking of error reason bits in IVPU_MMU_CMDQ_CONS register when waiting for consumer timeout occurred. Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231028155936.1183342-6-stanislaw.gruszka@linux.intel.com
| * | | accel/ivpu: Abort pending rx ipc on resetStanislaw Gruszka2023-10-313-4/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Waking up process, which wait for particular condition, will go to sleep again on wake_up() if the condition is not met. Add abort flag to wake up IPC receivers, which will finish with -ECANCELED error. This is only needed for reset, run time power management prevent to suspend VPU when there is pending IPC processing or pending job. Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231028155936.1183342-5-stanislaw.gruszka@linux.intel.com
| * | | accel/ivpu: Stop job_done_thread on suspendStanislaw Gruszka2023-10-314-6/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Stop job_done thread when going to suspend. Use kthread_park() instead of kthread_stop() to avoid memory allocation and potential failure on resume. Use separate function as thread wake up condition. Use spin lock to assure rx_msg_list is properly protected against concurrent access. This avoid race condition when the rx_msg_list list is modified and read in ivpu_ipc_recive() at the same time. Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231028155936.1183342-4-stanislaw.gruszka@linux.intel.com
| * | | accel/ivpu: Assure device is off if power up sequence failStanislaw Gruszka2023-10-312-14/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We should not leave device half enabled if there is failure somewhere it power up sequence. Fix device init and resume paths. Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231028155936.1183342-3-stanislaw.gruszka@linux.intel.com
| * | | accel/ivpu/40xx: Allow to change profiling frequencyKrystian Pradzynski2023-10-315-0/+76
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Profiling freq is a debug firmware feature. It switches default clock to higher resolution for fine-grained and more accurate firmware task profiling. We already configure it during boot up of VPU4. Add debugfs knob and helpers per HW generation that allow to change it. For vpu37xx the implementation is empty as profiling frequency can only be changed on VPU4 or newer. Signed-off-by: Krystian Pradzynski <krystian.pradzynski@linux.intel.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231028155936.1183342-2-stanislaw.gruszka@linux.intel.com
| * | | accel/ivpu: Add support for delayed D0i3 entry messageAndrzej Kacprowski2023-10-308-8/+108
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the VPU firmware prepares for D0i3 every time the VPU is entering D0i2 Idle state. This is not optimal as we might not enter D0i3 every time we enter D0i2 Idle and this preparation is quite costly. This optimization moves D0i3 preparation to a dedicated message sent from the host driver only when the driver is about to enter D0i3 - this reduces power consumption and latency for certain workloads, for example audio workloads that submit inference every 10 ms. The VPU needs non zero time to enter IDLE state after responding to D0i3 entry message. If the driver does not wait for the VPU to enter IDLE state it could cause warm boot failures. Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@linux.intel.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231028133415.1169975-12-stanislaw.gruszka@linux.intel.com
| * | | accel/ivpu/37xx: Print warning when VPUIP is not idle during power downStanislaw Gruszka2023-10-301-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Print warning if VPUIP is not idle during power down. Use warn log level also when we fail to enter reset state as this is not really an error but unexpected behavior. Reviewed-by: Krystian Pradzynski <krystian.pradzynski@linux.intel.com> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231028133415.1169975-11-stanislaw.gruszka@linux.intel.com
| * | | accel/ivpu: Introduce ivpu_ipc_send_receive_active()Karol Wachowski2023-10-302-14/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Split ivpu_ipc_send_receive() implementation to have a version that does not call pm_runtime_resume_and_get(). That implementation can be invoked when device is up and runtime resume is prohibited (for example at the end of boot sequence). The new function will be used for D0i3 entry IPC message addition in the separate change. Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231028133415.1169975-10-stanislaw.gruszka@linux.intel.com
| * | | accel/ivpu: Pass D0i3 residency time to the VPU firmwareAndrzej Kacprowski2023-10-304-1/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The firmware needs to know the time spent in D0i3/D3 to calculate telemetry data. The D0i3/D3 residency time is calculated by the driver and passed to the firmware in the boot parameters. The driver also passes VPU perf counter value captured right before entering D0i3 - this allows the VPU firmware to generate monotonic timestamps for the logs. Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@linux.intel.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231028133415.1169975-9-stanislaw.gruszka@linux.intel.com