author | Lancelot SIX <lancelot.six@amd.com> | 2024-07-12 23:22:29 +0100 |
---|---|---|
committer | Alex Deucher <alexander.deucher@amd.com> | 2024-12-10 10:26:51 -0500 |
commit | 5690011a7006f8a2ce1dbf32d733c3b1454af6da (patch) | |
tree | e75cced829d567da1097df30d58610fbe6d08546 /drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | |
parent | 549120edfda954b31ad2f0bc8e1829334d042c0c (diff) | |
drm/amdkfd: Handle save/restore of lds allocated in 1280B blocks
The gfx-9 trap handler reads the LDS allocation size in 256-byte
granularity (from SQ_WAVE_LDS_ALLOC), but it assumes that this value is
always even (i.e. that LDS is really allocated in multiples of 512
bytes). This was true so far, but gfx-950 allocates LDS in chunks of
1280 bytes, making this assumption invalid. This can cause the trap
handler to try to save / restore past the end of LDS, and past the LDS
slot allocated in the save area, overwriting data from the following
wave.
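The overrun can be illustrated with a little arithmetic. The sketch below is not the trap handler code (which is written in GPU assembly); it is a hypothetical C model that assumes a loop stepping through LDS 512 bytes at a time until the allocation is covered. All function and macro names are made up for illustration.

```c
#include <stdint.h>

#define LDS_GRANULE_BYTES 256u /* granularity reported via SQ_WAVE_LDS_ALLOC */
#define OLD_STEP_BYTES    512u /* step size implied by the "always even" assumption */

/* Bytes actually allocated, given the size in 256-byte granules. */
static uint32_t lds_alloc_bytes(uint32_t granules)
{
    return granules * LDS_GRANULE_BYTES;
}

/* Bytes touched by a loop that only steps in 512-byte units (assumed model):
 * the allocation size rounded up to the next multiple of 512. */
static uint32_t lds_bytes_touched_by_512_steps(uint32_t granules)
{
    uint32_t bytes = lds_alloc_bytes(granules);

    return (bytes + OLD_STEP_BYTES - 1) / OLD_STEP_BYTES * OLD_STEP_BYTES;
}

/* Bytes accessed past the end of the allocation. */
static uint32_t lds_overrun_bytes(uint32_t granules)
{
    return lds_bytes_touched_by_512_steps(granules) - lds_alloc_bytes(granules);
}
```

Under this model, one 1280-byte block (5 granules) leads to a 256-byte overrun, while an even granule count such as 4 (1024 bytes) or 10 (2560 bytes) does not overrun.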
This patch updates the trap handler to support LDS allocated in
1280-byte blocks:
- During restore, copy from main memory directly to LDS in batches of
1280 bytes.
- During save, continue to use 512-byte blocks (we only have 2 VGPRs we
can use to hold data), making sure to mask the upper half of the wave
when the LDS size is not a multiple of 512 bytes.
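The two strategies above can be sketched as a small planning helper. This is a hypothetical C model, not the actual handler: it assumes the LDS size is a multiple of 256 bytes for save and a multiple of 1280 bytes for restore, and that a 512-byte step corresponds to a full 64-lane wave writing 2 VGPRs of 4 bytes each. The struct and function names are invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* How a save would be split up (illustrative only). */
struct lds_save_plan {
    uint32_t full_steps; /* 512-byte steps with the whole wave active */
    bool     half_step;  /* final 256-byte step, upper half of the wave masked */
};

/* Save: keep 512-byte steps, and finish with a half-masked step
 * when the size is not a multiple of 512 bytes. */
static struct lds_save_plan plan_lds_save(uint32_t lds_bytes)
{
    struct lds_save_plan plan;

    plan.full_steps = lds_bytes / 512u;
    plan.half_step = (lds_bytes % 512u) != 0;
    return plan;
}

/* Restore: number of 1280-byte batches copied from memory to LDS. */
static uint32_t lds_restore_batches(uint32_t lds_bytes)
{
    return lds_bytes / 1280u;
}
```

For a single 1280-byte block, this model gives two full 512-byte save steps plus one half-masked step, and one restore batch.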
Signed-off-by: Lancelot SIX <lancelot.six@amd.com>
Co-authored-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Jay Cornwall <jay.cornwall@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Diffstat (limited to 'drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c')
0 files changed, 0 insertions, 0 deletions