diff options
author | Dave Airlie <airlied@redhat.com> | 2020-11-04 10:55:11 +1000 |
---|---|---|
committer | Dave Airlie <airlied@redhat.com> | 2020-11-04 11:49:10 +1000 |
commit | 1cd260a7905e3ba2e5dfa39b110ad6cf8f466f49 (patch) | |
tree | bfb701fdb0fcb32f8e6e53fb1692361c8fa33a6a /drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | |
parent | 3cea11cd5e3b00d91caf0b4730194039b45c5891 (diff) | |
parent | 4dfec0d1d7b9970f36931de714b379dbeaed83f8 (diff) | |
download | linux-1cd260a7905e3ba2e5dfa39b110ad6cf8f466f49.tar.gz linux-1cd260a7905e3ba2e5dfa39b110ad6cf8f466f49.tar.bz2 linux-1cd260a7905e3ba2e5dfa39b110ad6cf8f466f49.zip |
Merge tag 'drm-misc-next-2020-10-27' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for 5.11:
UAPI Changes:
- doc: rules for EBUSY on non-blocking commits; requirements for fourcc
modifiers; on parsing EDID
- fbdev/sbuslib: Remove unused FBIOSCURSOR32
- fourcc: deprecate DRM_FORMAT_MOD_NONE
- virtio: Support blob resources for memory allocations; Expose host-visible
and cross-device features
Cross-subsystem Changes:
- devicetree: Add vendor Prefix for Yes Optoelectronics, Shanghai Top Display
Optoelectronics
- dma-buf: Add struct dma_buf_map that stores DMA pointer and I/O-memory flag;
dma_buf_vmap()/vunmap() return address in dma_buf_map; Use struct_size() macro
Core Changes:
- atomic: pass full state to CRTC atomic enable/disable; warn for EBUSY during
non-blocking commits
- dp: Prepare for DP 2.0 DPCD
- dp_mst: Receive extended DPCD caps
- dma-buf: Documentation
- doc: Format modifiers; dma-buf-map; Cleanups
- fbdev: Don't use compat_alloc_user_space(); mark as orphaned
- fb-helper: Take lock in drm_fb_helper_restore_work_fb()
- gem: Convert implementation and drivers to GEM object functions, remove
GEM callbacks from struct drm_driver (expect gem_prime_mmap)
- panel: Cleanups
- pci: Add legacy infix to drm_irq_by_busid()
- sched: Avoid infinite waits in drm_sched_entity_destroy()
- switcheroo: Cleanups
- ttm: Remove AGP support; Don't modify caching during swapout; Major
refactoring of the implementation and API that affects all depending
drivers; Add ttm_bo_wait_ctx(); Add ttm_bo_pin()/unpin() in favor of
TTM_PL_FLAG_NO_EVICT; Remove ttm_bo_create(); Remove fault_reserve_notify()
callback; Push move() implementation into drivers; Remove TTM_PAGE_FLAG_WRITE;
Replace caching flags with init-time cache setting; Push ttm_tt_bind() into
drivers; Replace move_notify() with delete_mem_notify(); No overlapping memcpy();
no more ttm_set_populated()
- vram-helper: Fix BO top-down placement; TTM-related changes; Init GEM
object functions with defaults; Default placement in system memory; Cleanups
Driver Changes:
- amdgpu: Use GEM object functions
- armada: Use GEM object functions
- aspeed: Configure output via sysfs; Init struct drm_driver with
- ast: Reload LUT after FB format changes
- bridge: Add driver and DT bindings for anx7625; Cleanups
- bridge/dw-hdmi: Constify ops
- bridge/ti-sn65dsi86: Add retries for link training
- bridge/lvds-codec: Add support for regulator
- bridge/tc358768: Restore connector support DRM_GEM_CMA_DRIVEROPS; Cleanups
- display/ti,j721e-dss: Add DT properies assigned-clocks, assigned-clocks-parent and
dma-coherent
- display/ti,am65s-dss: Add DT properies assigned-clocks, assigned-clocks-parent and
dma-coherent
- etnaviv: Use GEM object functions
- exynos: Use GEM object functions
- fbdev: Cleanups and compiler fixes throughout framebuffer drivers
- fbdev/cirrusfb: Avoid division by 0
- gma500: Use GEM object functions; Fix double-free of connector; Cleanups
- hisilicon/hibmc: I2C-based DDC support; Use to_hibmc_drm_device(); Cleanups
- i915: Use GEM object functions
- imx/dcss: Init driver with DRM_GEM_CMA_DRIVER_OPS; Cleanups
- ingenic: Reset pixel clock when parent clock changes; support reserved
memory; Alloc F0 and F1 DMA channels at once; Support different pixel formats;
Revert support for cached mmap buffers
on F0/F1; support 30-bit/24-bit/8-bit-palette modes
- komeda: Use DEFINE_SHOW_ATTRIBUTE
- mcde: Detect platform_get_irq() errors
- mediatek: Use GEM object functions
- msm: Use GEM object functions
- nouveau: Cleanups; TTM-related changes; Use GEM object functions
- omapdrm: Use GEM object functions
- panel: Add driver and DT bindings for Novatak nt36672a; Add driver and DT
bindings for YTC700TLAG-05-201C; Add driver and DT bindings for TDO TL070WSH30;
Cleanups
- panel/mantix: Fix reset; Fix deref of NULL pointer in mantix_get_modes()
- panel/otm8009a: Allow non-continuous dsi clock; Cleanups
- panel/rm68200: Allow non-continuous dsi clock; Fix mode to 50 FPS
- panfrost: Fix job timeout handling; Cleanups
- pl111: Use GEM object functions
- qxl: Cleanups; TTM-related changes; Pin new BOs with ttm_bo_init_reserved()
- radeon: Cleanups; TTM-related changes; Use GEM object functions
- rockchip: Use GEM object functions
- shmobile: Cleanups
- tegra: Use GEM object functions
- tidss: Set drm_plane_helper_funcs.prepare_fb
- tilcdc: Don't keep vblank interrupt enabled all the time
- tve200: Detect platform_get_irq() errors
- vc4: Use GEM object functions; Only register components once DSI is attached;
Add Maxime as maintainer
- vgem: Use GEM object functions
- via: Simplify critical section in via_mem_alloc()
- virtgpu: Use GEM object functions
- virtio: Implement blob resources, host-visible and cross-device features;
Support mapping of host-allocated resources; Use UUID APi; Cleanups
- vkms: Use GEM object functions; Switch to SHMEM
- vmwgfx: TTM-related changes; Inline ttm_bo_swapout_all()
- xen: Use GEM object functions
- xlnx: Use GEM object functions
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20201027100936.GA4858@linux-uq9g
Diffstat (limited to 'drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c')
-rw-r--r-- | drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 157 |
1 files changed, 113 insertions, 44 deletions
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c index 8039d2399584..ddb1c8e9eea4 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c @@ -66,6 +66,8 @@ static int amdgpu_ttm_backend_bind(struct ttm_bo_device *bdev, struct ttm_tt *ttm, struct ttm_resource *bo_mem); +static void amdgpu_ttm_backend_unbind(struct ttm_bo_device *bdev, + struct ttm_tt *ttm); static int amdgpu_ttm_init_on_chip(struct amdgpu_device *adev, unsigned int type, @@ -92,7 +94,7 @@ static void amdgpu_evict_flags(struct ttm_buffer_object *bo, .fpfn = 0, .lpfn = 0, .mem_type = TTM_PL_SYSTEM, - .flags = TTM_PL_MASK_CACHING + .flags = 0 }; /* Don't handle scatter gather BOs */ @@ -292,11 +294,9 @@ static int amdgpu_ttm_map_buffer(struct ttm_buffer_object *bo, cpu_addr = &job->ibs[0].ptr[num_dw]; if (mem->mem_type == TTM_PL_TT) { - struct ttm_dma_tt *dma; dma_addr_t *dma_address; - dma = container_of(bo->ttm, struct ttm_dma_tt, ttm); - dma_address = &dma->dma_address[offset >> PAGE_SHIFT]; + dma_address = &bo->ttm->dma_address[offset >> PAGE_SHIFT]; r = amdgpu_gart_map(adev, 0, num_pages, dma_address, flags, cpu_addr); if (r) @@ -538,19 +538,13 @@ static int amdgpu_move_vram_ram(struct ttm_buffer_object *bo, bool evict, placements.fpfn = 0; placements.lpfn = 0; placements.mem_type = TTM_PL_TT; - placements.flags = TTM_PL_MASK_CACHING; + placements.flags = 0; r = ttm_bo_mem_space(bo, &placement, &tmp_mem, ctx); if (unlikely(r)) { pr_err("Failed to find GTT space for blit from VRAM\n"); return r; } - /* set caching flags */ - r = ttm_tt_set_placement_caching(bo->ttm, tmp_mem.placement); - if (unlikely(r)) { - goto out_cleanup; - } - r = ttm_tt_populate(bo->bdev, bo->ttm, ctx); if (unlikely(r)) goto out_cleanup; @@ -567,8 +561,13 @@ static int amdgpu_move_vram_ram(struct ttm_buffer_object *bo, bool evict, goto out_cleanup; } - /* move BO (in tmp_mem) to new_mem */ - r = ttm_bo_move_ttm(bo, ctx, new_mem); + r = ttm_bo_wait_ctx(bo, ctx); + if (unlikely(r)) + goto out_cleanup; + + amdgpu_ttm_backend_unbind(bo->bdev, bo->ttm); + ttm_resource_free(bo, &bo->mem); + ttm_bo_assign_mem(bo, new_mem); out_cleanup: ttm_resource_free(bo, &tmp_mem); return r; @@ -599,7 +598,7 @@ static int amdgpu_move_ram_vram(struct ttm_buffer_object *bo, bool evict, placements.fpfn = 0; placements.lpfn = 0; placements.mem_type = TTM_PL_TT; - placements.flags = TTM_PL_MASK_CACHING; + placements.flags = 0; r = ttm_bo_mem_space(bo, &placement, &tmp_mem, ctx); if (unlikely(r)) { pr_err("Failed to find GTT space for blit to VRAM\n"); @@ -607,11 +606,16 @@ static int amdgpu_move_ram_vram(struct ttm_buffer_object *bo, bool evict, } /* move/bind old memory to GTT space */ - r = ttm_bo_move_ttm(bo, ctx, &tmp_mem); + r = ttm_tt_populate(bo->bdev, bo->ttm, ctx); + if (unlikely(r)) + return r; + + r = amdgpu_ttm_backend_bind(bo->bdev, bo->ttm, &tmp_mem); if (unlikely(r)) { goto out_cleanup; } + ttm_bo_assign_mem(bo, &tmp_mem); /* copy to VRAM */ r = amdgpu_move_blit(bo, evict, new_mem, old_mem); if (unlikely(r)) { @@ -660,9 +664,17 @@ static int amdgpu_bo_move(struct ttm_buffer_object *bo, bool evict, struct ttm_resource *old_mem = &bo->mem; int r; + if (new_mem->mem_type == TTM_PL_TT) { + r = amdgpu_ttm_backend_bind(bo->bdev, bo->ttm, new_mem); + if (r) + return r; + } + + amdgpu_bo_move_notify(bo, evict, new_mem); + /* Can't move a pinned BO */ abo = ttm_to_amdgpu_bo(bo); - if (WARN_ON_ONCE(abo->pin_count > 0)) + if (WARN_ON_ONCE(abo->tbo.pin_count > 0)) return -EINVAL; adev = amdgpu_ttm_adev(bo->bdev); @@ -671,14 +683,24 @@ static int amdgpu_bo_move(struct ttm_buffer_object *bo, bool evict, ttm_bo_move_null(bo, new_mem); return 0; } - if ((old_mem->mem_type == TTM_PL_TT && - new_mem->mem_type == TTM_PL_SYSTEM) || - (old_mem->mem_type == TTM_PL_SYSTEM && - new_mem->mem_type == TTM_PL_TT)) { - /* bind is enough */ + if (old_mem->mem_type == TTM_PL_SYSTEM && + new_mem->mem_type == TTM_PL_TT) { ttm_bo_move_null(bo, new_mem); return 0; } + + if (old_mem->mem_type == TTM_PL_TT && + new_mem->mem_type == TTM_PL_SYSTEM) { + r = ttm_bo_wait_ctx(bo, ctx); + if (r) + goto fail; + + amdgpu_ttm_backend_unbind(bo->bdev, bo->ttm); + ttm_resource_free(bo, &bo->mem); + ttm_bo_assign_mem(bo, new_mem); + return 0; + } + if (old_mem->mem_type == AMDGPU_PL_GDS || old_mem->mem_type == AMDGPU_PL_GWS || old_mem->mem_type == AMDGPU_PL_OA || @@ -712,12 +734,12 @@ memcpy: if (!amdgpu_mem_visible(adev, old_mem) || !amdgpu_mem_visible(adev, new_mem)) { pr_err("Move buffer fallback to memcpy unavailable\n"); - return r; + goto fail; } r = ttm_bo_move_memcpy(bo, ctx, new_mem); if (r) - return r; + goto fail; } if (bo->type == ttm_bo_type_device && @@ -732,6 +754,11 @@ memcpy: /* update statistics */ atomic64_add((u64)bo->num_pages << PAGE_SHIFT, &adev->num_bytes_moved); return 0; +fail: + swap(*new_mem, bo->mem); + amdgpu_bo_move_notify(bo, false, new_mem); + swap(*new_mem, bo->mem); + return r; } /** @@ -767,6 +794,7 @@ static int amdgpu_ttm_io_mem_reserve(struct ttm_bo_device *bdev, struct ttm_reso mem->bus.offset += adev->gmc.aper_base; mem->bus.is_iomem = true; + mem->bus.caching = ttm_write_combined; break; default: return -EINVAL; @@ -811,7 +839,7 @@ uint64_t amdgpu_ttm_domain_start(struct amdgpu_device *adev, uint32_t type) * TTM backend functions. */ struct amdgpu_ttm_tt { - struct ttm_dma_tt ttm; + struct ttm_tt ttm; struct drm_gem_object *gobj; u64 offset; uint64_t userptr; @@ -943,7 +971,7 @@ bool amdgpu_ttm_tt_get_user_pages_done(struct ttm_tt *ttm) if (!gtt || !gtt->userptr) return false; - DRM_DEBUG_DRIVER("user_pages_done 0x%llx pages 0x%lx\n", + DRM_DEBUG_DRIVER("user_pages_done 0x%llx pages 0x%x\n", gtt->userptr, ttm->num_pages); WARN_ONCE(!gtt->range || !gtt->range->hmm_pfns, @@ -1095,7 +1123,7 @@ static int amdgpu_ttm_gart_bind(struct amdgpu_device *adev, gart_bind_fail: if (r) - DRM_ERROR("failed to bind %lu pages at 0x%08llX\n", + DRM_ERROR("failed to bind %u pages at 0x%08llX\n", ttm->num_pages, gtt->offset); return r; @@ -1130,7 +1158,7 @@ static int amdgpu_ttm_backend_bind(struct ttm_bo_device *bdev, } } if (!ttm->num_pages) { - WARN(1, "nothing to bind %lu pages for mreg %p back %p!\n", + WARN(1, "nothing to bind %u pages for mreg %p back %p!\n", ttm->num_pages, bo_mem, ttm); } @@ -1153,7 +1181,7 @@ static int amdgpu_ttm_backend_bind(struct ttm_bo_device *bdev, ttm->pages, gtt->ttm.dma_address, flags); if (r) - DRM_ERROR("failed to bind %lu pages at 0x%08llX\n", + DRM_ERROR("failed to bind %u pages at 0x%08llX\n", ttm->num_pages, gtt->offset); gtt->bound = true; return r; @@ -1267,8 +1295,8 @@ static void amdgpu_ttm_backend_unbind(struct ttm_bo_device *bdev, /* unbind shouldn't be done for GDS/GWS/OA in ttm_bo_clean_mm */ r = amdgpu_gart_unbind(adev, gtt->offset, ttm->num_pages); if (r) - DRM_ERROR("failed to unbind %lu pages at 0x%08llX\n", - gtt->ttm.ttm.num_pages, gtt->offset); + DRM_ERROR("failed to unbind %u pages at 0x%08llX\n", + gtt->ttm.num_pages, gtt->offset); gtt->bound = false; } @@ -1282,7 +1310,7 @@ static void amdgpu_ttm_backend_destroy(struct ttm_bo_device *bdev, if (gtt->usertask) put_task_struct(gtt->usertask); - ttm_dma_tt_fini(>t->ttm); + ttm_tt_fini(>t->ttm); kfree(gtt); } @@ -1296,7 +1324,9 @@ static void amdgpu_ttm_backend_destroy(struct ttm_bo_device *bdev, static struct ttm_tt *amdgpu_ttm_tt_create(struct ttm_buffer_object *bo, uint32_t page_flags) { + struct amdgpu_bo *abo = ttm_to_amdgpu_bo(bo); struct amdgpu_ttm_tt *gtt; + enum ttm_caching caching; gtt = kzalloc(sizeof(struct amdgpu_ttm_tt), GFP_KERNEL); if (gtt == NULL) { @@ -1304,12 +1334,17 @@ static struct ttm_tt *amdgpu_ttm_tt_create(struct ttm_buffer_object *bo, } gtt->gobj = &bo->base; + if (abo->flags & AMDGPU_GEM_CREATE_CPU_GTT_USWC) + caching = ttm_write_combined; + else + caching = ttm_cached; + /* allocate space for the uninitialized page entries */ - if (ttm_sg_tt_init(>t->ttm, bo, page_flags)) { + if (ttm_sg_tt_init(>t->ttm, bo, page_flags, caching)) { kfree(gtt); return NULL; } - return >t->ttm.ttm; + return >t->ttm; } /** @@ -1332,7 +1367,6 @@ static int amdgpu_ttm_tt_populate(struct ttm_bo_device *bdev, return -ENOMEM; ttm->page_flags |= TTM_PAGE_FLAG_SG; - ttm_tt_set_populated(ttm); return 0; } @@ -1352,7 +1386,6 @@ static int amdgpu_ttm_tt_populate(struct ttm_bo_device *bdev, drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages, gtt->ttm.dma_address, ttm->num_pages); - ttm_tt_set_populated(ttm); return 0; } @@ -1478,7 +1511,7 @@ bool amdgpu_ttm_tt_affect_userptr(struct ttm_tt *ttm, unsigned long start, /* Return false if no part of the ttm_tt object lies within * the range */ - size = (unsigned long)gtt->ttm.ttm.num_pages * PAGE_SIZE; + size = (unsigned long)gtt->ttm.num_pages * PAGE_SIZE; if (gtt->userptr > end || gtt->userptr + size <= start) return false; @@ -1529,7 +1562,7 @@ uint64_t amdgpu_ttm_tt_pde_flags(struct ttm_tt *ttm, struct ttm_resource *mem) if (mem && mem->mem_type == TTM_PL_TT) { flags |= AMDGPU_PTE_SYSTEM; - if (ttm->caching_state == tt_cached) + if (ttm->caching == ttm_cached) flags |= AMDGPU_PTE_SNOOPED; } @@ -1699,20 +1732,23 @@ static int amdgpu_ttm_access_memory(struct ttm_buffer_object *bo, return ret; } +static void +amdgpu_bo_delete_mem_notify(struct ttm_buffer_object *bo) +{ + amdgpu_bo_move_notify(bo, false, NULL); +} + static struct ttm_bo_driver amdgpu_bo_driver = { .ttm_tt_create = &amdgpu_ttm_tt_create, .ttm_tt_populate = &amdgpu_ttm_tt_populate, .ttm_tt_unpopulate = &amdgpu_ttm_tt_unpopulate, - .ttm_tt_bind = &amdgpu_ttm_backend_bind, - .ttm_tt_unbind = &amdgpu_ttm_backend_unbind, .ttm_tt_destroy = &amdgpu_ttm_backend_destroy, .eviction_valuable = amdgpu_ttm_bo_eviction_valuable, .evict_flags = &amdgpu_evict_flags, .move = &amdgpu_bo_move, .verify_access = &amdgpu_verify_access, - .move_notify = &amdgpu_bo_move_notify, + .delete_mem_notify = &amdgpu_bo_delete_mem_notify, .release_notify = &amdgpu_bo_release_notify, - .fault_reserve_notify = &amdgpu_bo_fault_reserve_notify, .io_mem_reserve = &amdgpu_ttm_io_mem_reserve, .io_mem_pfn = amdgpu_ttm_io_mem_pfn, .access_memory = &amdgpu_ttm_access_memory, @@ -2092,15 +2128,48 @@ void amdgpu_ttm_set_buffer_funcs_status(struct amdgpu_device *adev, bool enable) adev->mman.buffer_funcs_enabled = enable; } +static vm_fault_t amdgpu_ttm_fault(struct vm_fault *vmf) +{ + struct ttm_buffer_object *bo = vmf->vma->vm_private_data; + vm_fault_t ret; + + ret = ttm_bo_vm_reserve(bo, vmf); + if (ret) + return ret; + + ret = amdgpu_bo_fault_reserve_notify(bo); + if (ret) + goto unlock; + + ret = ttm_bo_vm_fault_reserved(vmf, vmf->vma->vm_page_prot, + TTM_BO_VM_NUM_PREFAULT, 1); + if (ret == VM_FAULT_RETRY && !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT)) + return ret; + +unlock: + dma_resv_unlock(bo->base.resv); + return ret; +} + +static struct vm_operations_struct amdgpu_ttm_vm_ops = { + .fault = amdgpu_ttm_fault, + .open = ttm_bo_vm_open, + .close = ttm_bo_vm_close, + .access = ttm_bo_vm_access +}; + int amdgpu_mmap(struct file *filp, struct vm_area_struct *vma) { struct drm_file *file_priv = filp->private_data; struct amdgpu_device *adev = drm_to_adev(file_priv->minor->dev); + int r; - if (adev == NULL) - return -EINVAL; + r = ttm_bo_mmap(filp, vma, &adev->mman.bdev); + if (unlikely(r != 0)) + return r; - return ttm_bo_mmap(filp, vma, &adev->mman.bdev); + vma->vm_ops = &amdgpu_ttm_vm_ops; + return 0; } int amdgpu_copy_buffer(struct amdgpu_ring *ring, uint64_t src_offset, |