summaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
Commit message (Collapse)AuthorAgeFilesLines
* drm/amdkfd: fix and enable debugging for gfx11Jonathan Kim2023-06-091-1/+6
| | | | | | | | | | | | | | | | | There are a couple of fixes required to enable gfx11 debugging. First, ADD_QUEUE.trap_en is an inappropriate place to toggle a per-process register so move it to SET_SHADER_DEBUGGER.trap_en. When ADD_QUEUE.skip_process_ctx_clear is set, MES will prioritize the SET_SHADER_DEBUGGER.trap_en setting. Second, to preserve correct save/restore priviledged wave states in coordination with the trap enablement setting, resume suspended waves early in the disable call. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: expose debug api for mesJonathan Kim2023-06-091-0/+32
| | | | | | | | | | | | | | | | | | Similar to the F32 HWS, the RS64 HWS for GFX11 now supports a multi-process debug API. The skip_process_ctx_clear ADD_QUEUE requirement is to prevent the MES from clearing the process context when the first queue is added to the scheduler in order to maintain debug mode settings during queue preemption and restore. The MES clears the process context in this case due to an unresolved FW caching bug during normal mode operations. During debug mode, the KFD will hold a reference to the target process so the process context should never go stale and MES can afford to skip this requirement. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: switch to unified amdgpu_ring_test_helperGuchun Chen2023-06-091-7/+2
| | | | | | | | This will simplify code. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amd/amdgpu: introduce gc_*_mes_2.bin v2Jack Xiao2023-04-111-4/+22
| | | | | | | | | | | | To avoid new mes fw running with old driver, rename mes schq fw to gc_*_mes_2.bin. v2: add MODULE_FIRMWARE declaration v3: squash in fixup patch Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amd/amdgpu: limit one queue per gangJack Xiao2023-03-221-6/+3
| | | | | | | | | Limit one queue per gang in mes self test, due to mes schq fw change. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amd/amdgpu/amdgpu_mes: Ensure amdgpu_bo_create_kernel()'s return value ↵Lee Jones2023-03-221-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | is checked Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c: In function ‘amdgpu_mes_ctx_alloc_meta_data’: drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c:1099:13: warning: variable ‘r’ set but not used [-Wunused-but-set-variable] Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Jack Xiao <Jack.Xiao@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amd: Use `amdgpu_ucode_*` helpers for MESMario Limonciello2023-01-091-8/+2
| | | | | | | | | | | | The `amdgpu_ucode_request` helper will ensure that the return code for missing firmware is -ENODEV so that early_init can fail. The `amdgpu_ucode_release` helper provides symmetry for releasing firmware. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amd: Load MES microcode during early_initMario Limonciello2023-01-091-0/+65
| | | | | | | | | | | | | | | | | Add an early_init phase to MES for fetching and validating microcode from the filesystem. If MES microcode is required but not available during early init, the firmware framebuffer will have already been released and the screen will freeze. Move the request for MES microcode into the early_init phase so that if it's not available, early_init will fail. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: zero the sdma_hqd_mask of 2nd SDMA engine for SDMA 6.0.1Yifan Zhang2022-09-141-0/+3
| | | | | | | | | | there is only one SDMA engine in SDMA 6.0.1, the sdma_hqd_mask has to be zeroed for the 2nd engine, otherwise MES scheduler will consider 2nd engine exists and map/unmap SDMA queues to the non-existent engine. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: ring aggregatged doorbell when mes queue is unmappedLe Ma2022-07-131-0/+7
| | | | | | | | | Ring aggregated doorbel to make unmapped queue scheduled in mes firmware. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: init aggregated doorbellLe Ma2022-07-131-4/+12
| | | | | | | | | Allocate and enable aggregated doorbell. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: fix bo va unmap issue in mesJack Xiao2022-07-121-3/+58
| | | | | | | | | | | Need reserve buffers before unmap mes ctx bo va. v2: fix removal of dma_resv_excl_fence() (Alex) v3: fix dma_resv_usage (Alex) Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: fix mes submission in atomic contextJack Xiao2022-07-081-15/+1
| | | | | | | | | | For some cases (accessing registers, unmap legacy queue), it needs access mes in atomic context. Use spinlock to protect agaist mes ring buffer race condition. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: Fix an error handling path in amdgpu_mes_self_test()Jianglei Nie2022-07-051-1/+3
| | | | | | | | | | | if amdgpu_mes_ctx_alloc_meta_data() fails, we should call amdgpu_vm_fini() to handle amdgpu_vm_init(). Add a new lable before amdgpu_vm_init() and goto this lable when amdgpu_mes_ctx_alloc_meta_data() fails. Signed-off-by: Jianglei Nie <niejianglei2021@163.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: add mes register access interfaceJack Xiao2022-06-301-1/+131
| | | | | | | | | | | | Add mes register access routines: 1. read register 2. write register 3. wait register 4. write and wait register Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: add mc wptr addr support for mesJack Xiao2022-06-281-2/+8
| | | | | | | | | MES requires mc wptr address for usermode queues. Export bo gart address for mc wptr address. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: Update mes_v11_api_def.hGraham Sider2022-06-231-0/+1
| | | | | | | | | | | | | Update MES API to support oversubscription without aggregated doorbell for usermode queues. v2: Change oversubscription_no_aggregated_en to is_kfd_process (align with MES) Signed-off-by: Graham Sider <Graham.Sider@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: fix format specifier for size_tAlex Deucher2022-05-101-1/+1
| | | | | | | | To avoid a warning on 32 bit. Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: add mes unmap legacy queue routineJack Xiao2022-05-041-123/+212
| | | | | | | | | | For mes kiq has been taken over by mes sched, drv can't directly use mes kiq to unmap queues. drv has to use mes sched api to unmap legacy queue. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: Update the doorbell function signaturesMukul Joshi2022-05-041-15/+22
| | | | | | | | | | | | | | Update the function signatures for process doorbell allocations with MES enabled to make them more generic. KFD would need to access these functions to allocate/free doorbells when MES is enabled. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Acked-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: disable mes sdma queue testJack Xiao2022-05-041-0/+5
| | | | | | | | | | Disable mes sdma queue test on sienna cichlid+, for fw hasn't supported to map sdma queue. The test can be enabled if fw supports. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: fix vm csa update issueJack Xiao2022-05-041-22/+59
| | | | | | | | | | Need reserve VM buffers before update VM csa. v2: rebase fixes Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: implement mes self testJack Xiao2022-05-041-0/+97
| | | | | | | | | | Add mes self test to verify its fundamental functionality by running ring test and ib test of mes kernel queue. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: add ring/ib test for mes self testJack Xiao2022-05-041-0/+32
| | | | | | | | | Run the ring test and ib test for mes self test. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: create gang and queues for mes self testJack Xiao2022-05-041-0/+39
| | | | | | | | | Create gang and queues for mes self test. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: map ctx metadata for mes self testJack Xiao2022-05-041-0/+37
| | | | | | | | | Map ctx metadata for mes self test. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: add helper functions to alloc/free ctx metadataJack Xiao2022-05-041-0/+25
| | | | | | | | | Add the helper functions to allocate/free context metadata. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: implement removing mes ringJack Xiao2022-05-041-0/+11
| | | | | | | | | Remove the mes ring and its resources. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: use ring for kernel queue submissionJack Xiao2022-05-041-0/+93
| | | | | | | | | Use ring as the front end for kernel queue submission. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: add helper function to get the ctx meta data offsetJack Xiao2022-05-041-0/+36
| | | | | | | | | Add the helper function to get the corresponding ctx meta data offset. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: add helper function to convert ring to queue propertyJack Xiao2022-05-041-0/+17
| | | | | | | | | Add the helper function to convert ring to queue property. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: implement removing mes queueJack Xiao2022-05-041-0/+45
| | | | | | | | | Remove the MES queue from MES scheduling and free its resources. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: implement adding mes queueJack Xiao2022-05-041-0/+105
| | | | | | | | | | Allocate related resources for the queue and add it to mes for scheduling. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: initialize mqd from queue propertiesJack Xiao2022-05-041-0/+54
| | | | | | | | | Add helper function to initialize mqd from queue properties. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: implement resuming all gangsJack Xiao2022-05-041-0/+25
| | | | | | | | | Implement resuming all gangs. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: implement suspending all gangsJack Xiao2022-05-041-0/+25
| | | | | | | | | Implement suspending all gangs. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: implement removing mes gangJack Xiao2022-05-041-0/+30
| | | | | | | | | Free the mes gang and its resources. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: implement adding mes gangJack Xiao2022-05-041-0/+67
| | | | | | | | | | Gang is a group of the same type queue, which is the scheduling unit of mes hardware scheduler. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: implement destroying mes processJack Xiao2022-05-041-0/+58
| | | | | | | | | Destroy the mes process, which free resources of the process. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: implement creating mes process v2Jack Xiao2022-05-041-0/+77
| | | | | | | | | | | | Create a mes process which contains process-related resources, like vm, doorbell bitmap, process ctx bo and etc. v2: move the simple variable to the end Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: relocate status_fence slot allocationJack Xiao2022-05-041-0/+11
| | | | | | | | | | Move the status_fence slot allocation from ip specific function to general mes function. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: initialize/finalize common mes structure v2Jack Xiao2022-05-041-0/+72
| | | | | | | | | | | | Initialize/finalize common mes structure. v2: add mutex_init for adev->mes.mutex Cc: Le Ma <le.ma@amd.com> Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/mes: manage mes doorbell allocationJack Xiao2022-05-041-0/+133
It is used to manage the doorbell allocation of mes processes and queues. Driver calls into process doorbell allocation to get the slice doorbell for the process, then the doorbell for a queue is allocated from the process doorbell slice. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>