author | Matthew Auld <matthew.auld@intel.com> | 2023-12-13 17:47:04 +0000
committer | Matthew Auld <matthew.auld@intel.com> | 2024-01-09 14:26:58 +0000
commit | 29f424eb8702b686cb6f07ddd659c6312e0c796d (patch)
tree | affd6094b18a5ad4e1a818247c362adf49667ae7 /drivers/gpu/drm/xe/xe_exec.c
parent | fa78e188d8d1df850eb232a2631012093aeeb0e0 (diff)
drm/xe/exec: move fence reservation
We currently assume that we can know upfront exactly how many fence
slots we will need at the start of the exec. However, the TTM
bo_validate step can itself consume numerous fence slots, and
dma_resv_reserve_fences() only guarantees that at least the requested
number of slots is available at the time of the call. It is therefore
quite possible for TTM to steal some of the fence slots, leaving us
without enough of them when it comes time to do the vma binding and the
final exec stage, and hitting a nasty BUG_ON() when the fences are
added. A simple fix is to reserve our own fences later, after the
validate stage.
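
To make the failure mode concrete, here is a minimal sketch of the
fragile ordering (illustrative only, not code from this patch; the
function name is hypothetical). dma_resv_reserve_fences() only
guarantees that at least the requested number of free slots exist at
the moment it is called, so anything that adds fences in between can
consume the reservation:

#include <linux/dma-resv.h>
#include <linux/dma-fence.h>

/*
 * Illustrative sketch, not from the patch: reserving before a validate
 * step is fragile because the reservation can be consumed in between.
 * The caller is assumed to hold the dma-resv lock throughout.
 */
static int fragile_reserve_before_validate(struct dma_resv *resv,
					   struct dma_fence *exec_fence)
{
	int ret;

	/* Guarantees >= 1 free fence slot *at this point in time* only. */
	ret = dma_resv_reserve_fences(resv, 1);
	if (ret)
		return ret;

	/*
	 * If a TTM bo_validate (eviction, moves, clears) runs here, it
	 * may call dma_resv_add_fence() itself and eat the slot above.
	 */

	/* With the slot gone, this trips the BUG_ON() in dma_resv_add_fence(). */
	dma_resv_add_fence(resv, exec_fence, DMA_RESV_USAGE_BOOKKEEP);
	return 0;
}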
References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/698
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Tested-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Diffstat (limited to 'drivers/gpu/drm/xe/xe_exec.c')
-rw-r--r-- | drivers/gpu/drm/xe/xe_exec.c | 40
1 file changed, 38 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_exec.c b/drivers/gpu/drm/xe/xe_exec.c
index ba92e5619da3..63e82e5285bc 100644
--- a/drivers/gpu/drm/xe/xe_exec.c
+++ b/drivers/gpu/drm/xe/xe_exec.c
@@ -96,7 +96,44 @@
 
 static int xe_exec_fn(struct drm_gpuvm_exec *vm_exec)
 {
-	return drm_gpuvm_validate(vm_exec->vm, &vm_exec->exec);
+	struct xe_vm *vm = container_of(vm_exec->vm, struct xe_vm, gpuvm);
+	struct drm_gem_object *obj;
+	unsigned long index;
+	int num_fences;
+	int ret;
+
+	ret = drm_gpuvm_validate(vm_exec->vm, &vm_exec->exec);
+	if (ret)
+		return ret;
+
+	/*
+	 * 1 fence slot for the final submit, and one more for every per-tile
+	 * bind. Note that there are potentially many vma per object/dma-resv,
+	 * however the fence slot will just be re-used, since they are largely
+	 * the same timeline and the seqno should be in order.
+	 */
+	num_fences = 1 + vm->xe->info.tile_count;
+
+	/*
+	 * We don't know upfront exactly how many fence slots we will need at
+	 * the start of the exec, since the TTM bo_validate above can consume
+	 * numerous fence slots. Also due to how the dma_resv_reserve_fences()
+	 * works it only ensures that at least that many fence slots are
+	 * available i.e if there are already 10 slots available and we reserve
+	 * two more, it can just noop without reserving anything. With this it
+	 * is quite possible that TTM steals some of the fence slots and then
+	 * when it comes time to do the vma binding and final exec stage we are
+	 * lacking enough fence slots, leading to some nasty BUG_ON() when
+	 * adding the fences. Hence just add our own fences here, after the
+	 * validate stage.
+	 */
+	drm_exec_for_each_locked_object(&vm_exec->exec, index, obj) {
+		ret = dma_resv_reserve_fences(obj->resv, num_fences);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
 }
 
 int xe_exec_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
@@ -189,7 +226,6 @@ retry:
 	}
 
 	vm_exec.vm = &vm->gpuvm;
-	vm_exec.num_fences = 1 + vm->xe->info.tile_count;
 	vm_exec.flags = DRM_EXEC_INTERRUPTIBLE_WAIT;
 	if (xe_vm_in_lr_mode(vm)) {
 		drm_exec_init(exec, vm_exec.flags);
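
For context, a rough sketch of how the reserved slots are consumed
later (hypothetical helper, fence names, and dma_resv usage values
chosen purely for illustration; the real consumers are the vma bind and
exec submission paths): one slot per per-tile bind fence plus one for
the final submit fence, matching num_fences = 1 + vm->xe->info.tile_count
above.

#include <linux/dma-resv.h>
#include <linux/dma-fence.h>
#include <drm/drm_gem.h>

/*
 * Hypothetical helper, names illustrative: attach one bind fence and
 * the final submit fence to a locked object, consuming two of the
 * slots reserved in xe_exec_fn(). Caller holds the dma-resv lock.
 */
static void example_attach_fences(struct drm_gem_object *obj,
				  struct dma_fence *bind_fence,
				  struct dma_fence *submit_fence)
{
	/* One reserved slot per per-tile bind fence ... */
	dma_resv_add_fence(obj->resv, bind_fence, DMA_RESV_USAGE_KERNEL);
	/* ... and one for the final exec submit fence. */
	dma_resv_add_fence(obj->resv, submit_fence, DMA_RESV_USAGE_BOOKKEEP);
}

Note the design choice in the patch: the slots are reserved per locked
object inside the callback, after drm_gpuvm_validate(), rather than via
vm_exec.num_fences, so the reservation no longer has to survive
whatever fences the validate step adds.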