path: root/drivers/gpu/drm/i915/i915_gem.c
Commit message (Author, Age, Files, Lines)
* drm/i915: Avoid unmapping pages from a NULL address space (Chris Wilson, 2011-03-23, 1 file, -3/+4)
    Found by gem_stress.

    As we perform retirement from a workqueue, it is possible for us to free and unbind objects after the last close on the device, and so after the address space has been torn down and reset to NULL:

    BUG: unable to handle kernel NULL pointer dereference at 00000054
    IP: [<c1295a20>] mutex_lock+0xf/0x27
    *pde = 00000000
    Oops: 0002 [#1] SMP
    last sysfs file: /sys/module/vt/parameters/default_utf8
    Pid: 5, comm: kworker/u:0 Not tainted 2.6.38+ #214
    EIP: 0060:[<c1295a20>] EFLAGS: 00010206 CPU: 1
    EIP is at mutex_lock+0xf/0x27
    EAX: 00000054 EBX: 00000054 ECX: 00000000 EDX: 00012fff
    ESI: 00000028 EDI: 00000000 EBP: f706fe20 ESP: f706fe18
    DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
    Process kworker/u:0 (pid: 5, ti=f706e000 task=f7060d00 task.ti=f706e000)
    Stack:
     f5aa3c60 00000000 f706fe74 c107e7df 00000246 dea55380 00000054 f5aa3c60
     f706fe44 00000061 f70b4000 c13fff84 00000008 f706fe54 00000000 00000000
     00012f00 00012fff 00000028 c109e575 f6b36700 00100000 00000000 f706fe90
    Call Trace:
     [<c107e7df>] unmap_mapping_range+0x7d/0x1e6
     [<c109e575>] ? mntput_no_expire+0x52/0xb6
     [<c11c12f6>] i915_gem_release_mmap+0x49/0x58
     [<c11c3449>] i915_gem_object_unbind+0x4c/0x125
     [<c11c353f>] i915_gem_free_object_tail+0x1d/0xdb
     [<c11c55a2>] i915_gem_free_object+0x3d/0x41
     [<c11a6be2>] ? drm_gem_object_free+0x0/0x27
     [<c11a6c07>] drm_gem_object_free+0x25/0x27
     [<c113c3ca>] kref_put+0x39/0x42
     [<c11c0a59>] drm_gem_object_unreference+0x16/0x18
     [<c11c0b15>] i915_gem_object_move_to_inactive+0xba/0xbe
     [<c11c0c87>] i915_gem_retire_requests_ring+0x16e/0x1a5
     [<c11c3645>] i915_gem_retire_requests+0x48/0x63
     [<c11c36ac>] i915_gem_retire_work_handler+0x4c/0x117
     [<c10385d1>] process_one_work+0x140/0x21b
     [<c103734c>] ? __need_more_worker+0x13/0x2a
     [<c10373b1>] ? need_to_create_worker+0x1c/0x35
     [<c11c3660>] ? i915_gem_retire_work_handler+0x0/0x117
     [<c1038faf>] worker_thread+0xd4/0x14b
     [<c1038edb>] ? worker_thread+0x0/0x14b
     [<c103be1b>] kthread+0x68/0x6d
     [<c103bdb3>] ? kthread+0x0/0x6d
     [<c12970f6>] kernel_thread_helper+0x6/0x10
    Code: 00 e8 98 fe ff ff 5d c3 55 89 e5 3e 8d 74 26 00 ba 01 00 00 00 e8 84 fe ff ff 5d c3 55 89 e5 53 8d 64 24 fc 3e 8d 74 26 00 89 c3 <f0> ff 08 79 05 e8 ab ff ff ff 89 e0 25 00 e0 ff ff 89 43 10 58
    EIP: [<c1295a20>] mutex_lock+0xf/0x27 SS:ESP 0068:f706fe18
    CR2: 0000000000000054

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: Keith Packard <keithp@keithp.com>
* drm/i915: Fix use after free within tracepoint (Chris Wilson, 2011-03-23, 1 file, -2/+2)
    Detected by scripts/coccinelle/free/kfree.cocci.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: Keith Packard <keithp@keithp.com>
* drm/i915: Restore missing command flush before interrupt on BLT ring (Chris Wilson, 2011-03-23, 1 file, -1/+6)
    We always skipped flushing the BLT ring if the request flush did not include the RENDER domain. However, this neglects that we try to flush the COMMAND domain after every batch and before the breadcrumb interrupt (to make sure the batch is indeed completed prior to the interrupt firing and so ensuring CPU coherency). As a result of the missing flush, incoherency did indeed creep in, most notably when using lots of command buffers and so potentially rewriting an active command buffer (i.e. the GPU was still executing from it even though the following interrupt had already fired and the request/buffer retired).

    As all ring->flush routines now have the same preconditions, de-duplicate and move those checks up into i915_gem_flush_ring().

    Fixes gem_linear_blit.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=35284
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    Tested-by: mengmeng.meng@intel.com
* drm/i915: Fix computation of pitch for dumb bo creator (Chris Wilson, 2011-03-23, 1 file, -1/+1)
    Cc: Dave Airlie <airlied@linux.ie>
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* drm/i915: Fix tiling corruption from pipelined fencing (Chris Wilson, 2011-03-23, 1 file, -27/+17)
    ... even though it was disabled. A mistake in the handling of fence reuse caused us to skip the vital delay of waiting for the object to finish rendering before changing the register. This resulted in us changing the fence register whilst the bo was active and so causing the blits to complete using the wrong stride or even the wrong tiling. (Visually the effect is that small blocks of the screen look like they have been interlaced). The fix is to wait for the GPU to finish using the memory region pointed to by the fence before changing it.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34584
    Cc: Andy Whitcroft <apw@canonical.com>
    Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
    Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    [Note for 2.6.38-stable, we need to reintroduce the interruptible passing]
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Tested-by: Dave Airlie <airlied@linux.ie>
* drm/i915: Prevent racy removal of request from client list (Herton Ronaldo Krzesinski, 2011-03-23, 1 file, -2/+4)
    When i915_gem_retire_requests_ring calls i915_gem_request_remove_from_client, the client_list for that request may already have been removed in i915_gem_release. So we may call list_del(&request->client_list) twice, resulting in an oops like this report:

    [126167.230394] BUG: unable to handle kernel paging request at 00100104
    [126167.230699] IP: [<f8c2ce44>] i915_gem_retire_requests_ring+0xd4/0x240 [i915]
    [126167.231042] *pdpt = 00000000314c1001 *pde = 0000000000000000
    [126167.231314] Oops: 0002 [#1] SMP
    [126167.231471] last sysfs file: /sys/devices/LNXSYSTM:00/device:00/PNP0C0A:00/power_supply/BAT1/current_now
    [126167.231901] Modules linked in: snd_seq_dummy nls_utf8 isofs btrfs zlib_deflate libcrc32c ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs xfs exportfs reiserfs cryptd aes_i586 aes_generic binfmt_misc vboxnetadp vboxnetflt vboxdrv parport_pc ppdev snd_hda_codec_hdmi snd_hda_codec_conexant snd_hda_intel snd_hda_codec snd_hwdep arc4 snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq uvcvideo videodev snd_timer snd_seq_device joydev iwlagn iwlcore mac80211 snd cfg80211 soundcore i915 drm_kms_helper snd_page_alloc psmouse drm serio_raw i2c_algo_bit video lp parport usbhid hid sky2 sdhci_pci ahci sdhci libahci
    [126167.232018]
    [126167.232018] Pid: 1101, comm: Xorg Not tainted 2.6.38-6-generic-pae #34-Ubuntu Gateway MC7833U /
    [126167.232018] EIP: 0060:[<f8c2ce44>] EFLAGS: 00213246 CPU: 0
    [126167.232018] EIP is at i915_gem_retire_requests_ring+0xd4/0x240 [i915]
    [126167.232018] EAX: 00200200 EBX: f1ac25b0 ECX: 00000040 EDX: 00100100
    [126167.232018] ESI: f1a2801c EDI: e87fc060 EBP: ef4d7dd8 ESP: ef4d7db0
    [126167.232018] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
    [126167.232018] Process Xorg (pid: 1101, ti=ef4d6000 task=f1ba6500 task.ti=ef4d6000)
    [126167.232018] Stack:
    [126167.232018] f1a28000 f1a2809c f1a28094 0058bd97 f1aa2400 f1a2801c 0058bd7b 0058bd85
    [126167.232018] f1a2801c f1a28000 ef4d7e38 f8c2e995 ef4d7e30 ef4d7e60 c14d1ebc f6b3a040
    [126167.232018] f1522cc0 000000db 00000000 f1ba6500 ffffffa1 00000000 00000001 f1a29214
    [126167.232018] Call Trace:

    Unfortunately the call trace reported was cut, but looking at debug symbols the crash is at __list_del, when probably list_del is called twice on the same request->client_list, as the dereferenced value is LIST_POISON1 + 4, and by looking more at the debug symbols before the list_del call it should have been called by i915_gem_request_remove_from_client.

    And as I can see in the code, it seems we indeed have the possibility to remove a request->client_list twice, which would cause the above, because we do list_del(&request->client_list) in both i915_gem_request_remove_from_client and i915_gem_release.

    As Chris Wilson pointed out, it's indeed the case: "(...) I had thought that the actual insertion/deletion was serialised under the struct mutex and the intention of the spinlock was to protect the unlocked list traversal during throttling. However, I missed that i915_gem_release() is also called without struct mutex and so we do need the double check for i915_gem_request_remove_from_client()."

    This change does the required check to avoid the duplicate remove of request->client_list.

    Bugzilla: http://bugs.launchpad.net/bugs/733780
    Cc: stable@kernel.org # 2.6.38
    Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
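The double check described in this message can be sketched in userspace C. This is only an illustration under stated assumptions, not the driver's code: `struct client`, `struct request`, `request_detach()` and the `atomic_flag` spinlock are hypothetical stand-ins for the file-private data, the request and its client_list handling. The point it shows is that the removal re-tests ownership under the lock, so a second caller finds nothing left to unlink.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Minimal doubly linked list, standing in for the kernel's list_head. */
struct list_node {
    struct list_node *prev, *next;
};

void list_init(struct list_node *head) { head->prev = head->next = head; }

void list_add_tail(struct list_node *node, struct list_node *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Unlink and clear the pointers; a second call on the same node would crash. */
void list_del(struct list_node *node)
{
    node->prev->next = node->next;
    node->next->prev = node->prev;
    node->prev = node->next = NULL;
}

/* Hypothetical client: a tiny spinlock plus the list of its requests. */
struct client {
    atomic_flag lock;
    struct list_node requests;
};

/* Hypothetical request: owner becomes NULL once it has left the list. */
struct request {
    _Atomic(struct client *) owner;
    struct list_node link;
};

void client_init(struct client *c)
{
    atomic_flag_clear(&c->lock);
    list_init(&c->requests);
}

void request_init(struct request *r) { atomic_init(&r->owner, NULL); }

static void client_lock(struct client *c)
{
    while (atomic_flag_test_and_set(&c->lock))
        ; /* spin until the flag is released */
}

static void client_unlock(struct client *c) { atomic_flag_clear(&c->lock); }

void request_attach(struct request *r, struct client *c)
{
    client_lock(c);
    list_add_tail(&r->link, &c->requests);
    atomic_store(&r->owner, c);
    client_unlock(c);
}

/* Returns 1 if this call unlinked the request, 0 if it was already gone. */
int request_detach(struct request *r)
{
    struct client *c = atomic_load(&r->owner);
    int removed = 0;

    if (c == NULL)
        return 0;

    client_lock(c);
    /* Re-check under the lock: another path may have removed it first. */
    if (atomic_load(&r->owner) != NULL) {
        list_del(&r->link);
        atomic_store(&r->owner, NULL);
        removed = 1;
    }
    client_unlock(c);
    return removed;
}
```

Calling `request_detach()` twice on the same request is harmless in this sketch: the first call unlinks it and the second returns 0 without touching the list.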
* Merge remote branch 'intel/drm-intel-next' of ../drm-next into drm-core-next (Dave Airlie, 2011-03-14, 1 file, -157/+148)
|\
    * 'intel/drm-intel-next' of ../drm-next: (755 commits)
      drm/i915: Only wait on a pending flip if we intend to write to the buffer
      drm/i915/dp: Sanity check eDP existence
      drm/i915: Rebind the buffer if its alignment constraints changes with tiling
      drm/i915: Disable GPU semaphores by default
      drm/i915: Do not overflow the MMADDR write FIFO
      Revert "drm/i915: fix corruptions on i8xx due to relaxed fencing"
      drm/i915: Don't save/restore hardware status page address register
      drm/i915: don't store the reg value for HWS_PGA
      drm/i915: fix memory corruption with GM965 and >4GB RAM
      Linux 2.6.38-rc7
      Revert "TPM: Long default timeout fix"
      drm/i915: Re-enable GPU semaphores for SandyBridge mobile
      drm/i915: Replace vblank PM QoS with "Interrupt-Based AGPBUSY#"
      Revert "drm/i915: Use PM QoS to prevent C-State starvation of gen3 GPU"
      drm/i915: Allow relocation deltas outside of target bo
      drm/i915: Silence an innocuous compiler warning for an unused variable
      fs/block_dev.c: fix new kernel-doc warning
      ACPI: Fix build for CONFIG_NET unset
      mm: <asm-generic/pgtable.h> must include <linux/mm_types.h>
      x86: Use u32 instead of long to set reset vector back to 0
      ...

    Conflicts:
      drivers/gpu/drm/i915/i915_gem.c
| * Merge branch 'drm-intel-fixes' into drm-intel-next (Chris Wilson, 2011-03-07, 1 file, -1/+1)
| |\
    Apply the trivial conflicting regression fixes, but keep GPU semaphores enabled.

    Conflicts:
      drivers/gpu/drm/i915/i915_drv.h
      drivers/gpu/drm/i915/i915_gem_execbuffer.c
| | * drm/i915: Rebind the buffer if its alignment constraints changes with tiling (Chris Wilson, 2011-03-07, 1 file, -1/+1)
    Early gen3 and gen2 chipsets do not have the relaxed per-surface tiling constraints of the later chipsets, so we need to check that the GTT alignment is correct for the new tiling. If it is not, we need to rebind.

    Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
| * | drm/i915: Use a device flag for non-interruptible phases (Chris Wilson, 2011-02-22, 1 file, -36/+22)
    The code paths for modesetting are growing in complexity as we may need to move the buffers around in order to fit the scanout in the aperture. Therefore we face a choice as to whether to thread the interruptible status through the entire pinning and unbinding code paths or to add a flag to the device when we may not be interrupted by a signal. This does the latter and so fixes a few instances of modesetting failures under stress.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
| * | drm/i915: Protect against drm_gem_object not being the first member (Chris Wilson, 2011-02-22, 1 file, -9/+9)
    Dave Airlie spotted that we had a potential bug should we ever rearrange the drm_i915_gem_object so that the base drm_gem_object was not its first member. He noticed that we often convert the return of drm_gem_object_lookup() immediately into drm_i915_gem_object and then check the result for nullity. This is only valid when the base object is the first member and so the superobject has the same address. Play safe instead and use the compiler to convert back to the original return address for sanity testing.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
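The hazard can be illustrated with a small standalone sketch. The types and helpers here (`struct base_object`, `struct driver_object`, `lookup()`, `to_driver_object()`) are hypothetical, and the technique is a swapped-in portable variant: rather than converting first and comparing the embedded member's address against NULL as the message describes, it tests the looked-up base pointer before the conversion. Both approaches avoid treating an offset NULL as a valid object.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical base object, as returned by a lookup routine. */
struct base_object {
    int handle;
};

/* Hypothetical driver object: the base is deliberately NOT the first member. */
struct driver_object {
    int tiling_mode;
    struct base_object base;
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Convert base to driver object, keeping a failed lookup as NULL. */
struct driver_object *to_driver_object(struct base_object *base)
{
    if (base == NULL)
        return NULL; /* converting NULL would yield a bogus non-NULL pointer */
    return container_of(base, struct driver_object, base);
}

/* Hypothetical lookup: find a base object by handle, or return NULL. */
struct base_object *lookup(struct driver_object *table, size_t count, int handle)
{
    for (size_t i = 0; i < count; i++) {
        if (table[i].base.handle == handle)
            return &table[i].base;
    }
    return NULL;
}
```

With `base` placed after `tiling_mode`, a cast-then-test-for-NULL sequence would compare a small negative offset against NULL and wrongly pass, which is the situation the commit guards against.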
| * | drm/i915: i915_mutex_interruptible() returns -EINTR (Chris Wilson, 2011-02-11, 1 file, -0/+1)
    ... so we handle that for i915_gem_fault() in the same manner as ERESTARTSYS, or we send a SIGBUS to the faulting application.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
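A rough sketch of the idea: a fault handler maps the error from its locking and binding step onto a fault outcome, and any error it does not recognise falls through to SIGBUS. The enum and function below are hypothetical stand-ins for the kernel's VM_FAULT_* codes and are not taken from the driver. `ERESTARTSYS` is defined locally because it is a kernel-internal value that userspace `<errno.h>` does not provide.

```c
#include <assert.h>
#include <errno.h>

#ifndef ERESTARTSYS
#define ERESTARTSYS 512 /* kernel-internal errno, absent from userspace headers */
#endif

/* Hypothetical stand-ins for the kernel's VM_FAULT_* results. */
enum fault_result {
    FAULT_NOPAGE, /* nothing installed; the access is simply retried */
    FAULT_OOM,    /* out of memory */
    FAULT_SIGBUS  /* deliver SIGBUS to the faulting application */
};

/* Map an errno-style return value onto a fault outcome. */
enum fault_result fault_result_from_errno(int err)
{
    switch (err) {
    case 0:
    case -ERESTARTSYS:
    case -EINTR: /* interrupted lock: retry rather than kill the process */
        return FAULT_NOPAGE;
    case -ENOMEM:
        return FAULT_OOM;
    default:
        return FAULT_SIGBUS;
    }
}
```

Without the `-EINTR` case, an interrupted mutex acquisition would land in the default branch and the application would receive a SIGBUS for what is only a transient condition.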
| * | drm/i915: Skip the no-op domain changes when already in CPU|GTT domains (Chris Wilson, 2011-02-07, 1 file, -0/+6)
    Removes some superfluous fluff from tracing...

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
| * | drm/i915: Refine tracepoints (Chris Wilson, 2011-02-07, 1 file, -94/+80)
    A lot of minor tweaks to fix the tracepoints, improve the outputting for ftrace, and to generally make the tracepoints useful again. It is a start and enough to begin identifying performance issues and gaps in our coverage.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
| * | drm/i915: Fix infinite loop regression from 21dd3734 (Chris Wilson, 2011-02-07, 1 file, -5/+25)
    By returning EAGAIN upon a wedged GPU before attempting to wait, we would hit an infinite loop of repeating the operation without ever progressing. Instead this needs to be EIO so that userspace knows that the GPU is truly wedged and not in the process of error recovery. Similarly, we need to handle the error recovery during i915_gem_fault.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
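The distinction between the two error codes can be sketched as follows. The structure and function names are hypothetical and the state is reduced to two flags. The assumption made here is that "wedged with recovery finished" means the GPU is truly dead (EIO), while "wedged with recovery still running" is a transient state worth retrying (EAGAIN).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified view of the GPU error state. */
struct gpu_state {
    bool wedged;            /* the GPU has hung */
    bool recovery_complete; /* the error handler has finished running */
};

/* Decide what a caller should be told before it waits on the GPU. */
int gpu_check_wedged(const struct gpu_state *gpu)
{
    if (!gpu->wedged)
        return 0;
    /* Recovery finished but still wedged: retrying can never succeed. */
    return gpu->recovery_complete ? -EIO : -EAGAIN;
}

/* A caller that retries only while the state is reported as transient. */
int gpu_wait_ready(const struct gpu_state *gpu, int max_retries)
{
    for (int attempt = 0; attempt <= max_retries; attempt++) {
        int ret = gpu_check_wedged(gpu);
        if (ret != -EAGAIN)
            return ret; /* either ready (0) or permanently failed (-EIO) */
    }
    return -EAGAIN; /* still recovering after all attempts */
}
```

Returning `-EAGAIN` for a permanently wedged GPU would keep a retrying caller spinning forever, which is the regression the message describes. The retry bound in `gpu_wait_ready()` is an addition for this sketch only.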
| * | drm/i915: Defer reporting EIO until we try to use the GPU (Chris Wilson, 2011-01-27, 1 file, -22/+14)
    Instead of reporting EIO upfront in the entrance of an ioctl that may or may not attempt to use the GPU, defer the actual detection of an invalid ioctl to when we issue a GPU instruction. This allows us to continue to use bo in video memory (via pread/pwrite and mmap) after the GPU has hung.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
| * | drm/i915: Check wedged status before throttling (Chris Wilson, 2011-01-27, 1 file, -0/+3)
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
| * | drm/i915: Silence a few -Wunused-but-set-variable (Chris Wilson, 2011-01-25, 1 file, -3/+0)
| |/
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* / drm: dumb scanout create/mmap for intel/radeon (v3) (Dave Airlie, 2011-02-07, 1 file, -30/+73)
|/
    This is just an idea that might or might not be a good idea, it basically adds two ioctls to create and map a dumb buffer suitable for scanout. The handle can be passed to the KMS ioctls to create a framebuffer.

    It looks to me like it would be useful in the following cases:
    a) in development drivers - we can always provide a shadowfb fallback.
    b) libkms users - we can clean up libkms a lot and avoid linking to libdrm_*.
    c) plymouth via libkms is a lot easier.

    Userspace bits would be just calls + mmaps. We could probably mark these handles somehow as not being suitable for acceleration so as to stop people who are dumber than dumb.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
* drm/i915,agp/intel: Do not clear stolen entries (Chris Wilson, 2011-01-24, 1 file, -3/+7)
    We can only utilize the stolen portion of the GTT if we are in sole charge of the hardware. This is only true if using GEM and KMS, otherwise VESA continues to access stolen memory.

    Reported-by: Arnd Bergmann <arnd@arndb.de>
    Reported-by: Frederic Weisbecker <fweisbec@gmail.com>
    Tested-by: Jiri Olsa <jolsa@redhat.com>
    Tested-by: Frederic Weisbecker <fweisbec@gmail.com>
    Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* drm/i915: Fix use of invalid array size for ring->sync_seqno (Chris Wilson, 2011-01-23, 1 file, -1/+1)
    There are I915_NUM_RINGS-1 inter-ring synchronisation counters, but we were clearing I915_NUM_RINGS of them. Oops.

    Reported-by: Jiri Slaby <jirislaby@gmail.com>
    Tested-by: Jiri Slaby <jirislaby@gmail.com>
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
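A minimal sketch of this class of bug and one common way to avoid it, assuming a hypothetical `struct ring` with three rings in total: deriving the loop bound from the array itself means the clear cannot run one element past the end. The `guard` field exists only so the sketch can show that the neighbouring memory stays untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_RINGS 3 /* hypothetical ring count for this sketch */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Each ring tracks one counter per OTHER ring, hence NUM_RINGS - 1 slots. */
struct ring {
    uint32_t sync_seqno[NUM_RINGS - 1];
    uint32_t guard; /* neighbouring field an overrun would clobber */
};

/* Clear the counters, bounding the loop by the array's real size. */
void ring_reset_sync(struct ring *ring)
{
    for (size_t i = 0; i < ARRAY_SIZE(ring->sync_seqno); i++)
        ring->sync_seqno[i] = 0;
}
```

Looping to `NUM_RINGS` instead would write one slot beyond `sync_seqno`, which in this layout would most likely zero `guard`.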
* drm/i915: If we hit OOM when allocating GTT pages, clear the aperture (Chris Wilson, 2011-01-11, 1 file, -8/+4)
    Rather than evicting an object at random, which is unlikely to alleviate the memory pressure sufficiently to allow us to continue, zap the entire aperture. That should give the system long enough to recover and reap some pages from the evicted objects, forestalling the allocation error for the new object.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* drm/i915: Periodically flush the active lists and requests (Chris Wilson, 2011-01-11, 1 file, -4/+26)
    In order to retire active buffers whilst no client is active, we need to insert our own flush requests onto the ring.

    This is useful for servers that queue up some rendering and then go to sleep as it allows us to complete the processing of those requests, potentially making that memory available again much earlier.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* drm/i915: Propagate error from flushing the ring (Chris Wilson, 2011-01-11, 1 file, -34/+68)
    ... in order to avoid a BUG() and potential unbounded waits.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* drm/i915: Handle ringbuffer stalls when flushing (Chris Wilson, 2011-01-11, 1 file, -2/+2)
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* drm/i915: Enforce write ordering through the GTT (Chris Wilson, 2011-01-11, 1 file, -1/+13)
    We need to ensure that writes through the GTT land before any modification to the MMIO registers and so must impose a mandatory write barrier when flushing the GTT domain. This was revealed by relaxing the write ordering by experimentally mapping the registers and the GATT as write-combining.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* drm/i915: Allow the application to choose the constant addressing mode (Chris Wilson, 2010-12-20, 1 file, -0/+2)
    The relative-to-general state default is useless as it means having to rewrite the streaming kernels for each batch. Relative-to-surface is more useful, as that stream usually needs to be rewritten for each batch. And absolute addressing mode, vital if you start streaming state, is also only available by adjusting the register...

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* drm/i915: Poll for seqno completion if IRQ is disabled (Chris Wilson, 2010-12-14, 1 file, -2/+4)
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32288
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* drm/i915/ringbuffer: Make IRQ refcnting atomic (Chris Wilson, 2010-12-14, 1 file, -17/+20)
    In order to enforce the correct memory barriers for irq get/put, we need to perform the actual counting using atomic operations.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* Merge branch 'drm-intel-fixes' into drm-intel-next (Chris Wilson, 2010-12-07, 1 file, -1/+16)
|\
    Conflicts:
      drivers/gpu/drm/i915/i915_gem.c
      drivers/gpu/drm/i915/intel_dp.c
| * drm/i915: Emit a request to clear a flushed and idle ring for unbusy bo (Chris Wilson, 2010-12-07, 1 file, -1/+11)
    In order for bos to retire eventually, a request must be sent down the ring. This is expected, for example, by occlusion queries, which mesa will wait upon (whilst running glean) before issuing more batches, and so the normal activity upon the ring is suspended and we need to emit a request to clear the idle ring.

    Reported-by: Jinjin, Wang <jinjin.wang@intel.com>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=30380
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* | drm/i915: Wait for the bo if a display flip is pipelined on the other ring (Chris Wilson, 2010-12-06, 1 file, -1/+1)
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* | drm/i915: Only emit a flush if there is an outstanding gpu write (Chris Wilson, 2010-12-06, 1 file, -2/+3)
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* | drm/i915: Completely disable fence pipelining. (Chris Wilson, 2010-12-05, 1 file, -2/+4)
    I'm still seeing tiling corruption of PutImage and CopyArea (I think) under mutter on pnv, so obviously the pipelining logic is deeply flawed.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* | drm/i915: Implement GPU semaphores for inter-ring synchronisation on SNB (Chris Wilson, 2010-12-05, 1 file, -44/+42)
    The bulk of the change is to convert the growing list of rings into an array so that the relationship between the rings and the semaphore sync registers can be easily computed.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* | drm/i915: Kill the get_fence tracepoint (Chris Wilson, 2010-12-02, 1 file, -3/+0)
    As the tracepoint is now decoupled from when the actual register is assigned and was never complemented by detailing when the object lost its fence, it has outlived its limited usefulness. Profiling the actual stalls is a far more profitable venture anyway.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* | drm/i915: Remove inactive LRU tracking from set_domain_ioctl (Chris Wilson, 2010-12-02, 1 file, -17/+0)
    As the userspace mappings are torn down on every GPU write, we prefer to track when the buffer is activated (via a fresh i915_gem_fault). This makes the LRU conceptually simpler. With coherent mappings, the remaining use-case for set_domain_ioctl is GPU synchronisation.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* | drm/i915: Pipelined fencing [infrastructure] (Chris Wilson, 2010-12-02, 1 file, -138/+222)
    With this change, every batchbuffer can use all available fences (save pinned and scanout, of course) without ever stalling the gpu! In theory. Currently the actual pipelined update of the register is disabled due to some stability issues. However, just the deferred update is a significant win.

    Based on a series of patches by Daniel Vetter.

    The premise is that before every access to a buffer through the GTT we have to declare whether we need a register or not. If the access is by the GPU, a pipelined update to the register is made via the ringbuffer, and we track the last seqno of the batches that access it. If by the CPU we wait for the last GPU access and update the register (either to clear or to set it for the current buffer).

    One advantage of being able to pipeline changes is that we can defer the actual updating of the fence register until we first need to access the object through the GTT, i.e. we can eliminate the stall on set_tiling. This is important as the userspace bo cache does not track the tiling status of active buffers which generate frequent stalls on gen3 when enabling tiling for an already bound buffer.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* | drm/i915: Prevent stalling for a GTT read back from a read-only GPU target (Chris Wilson, 2010-12-02, 1 file, -3/+6)
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* | drm/i915: Release fenced GTT mapping on suspend (Chris Wilson, 2010-11-28, 1 file, -2/+9)
    ... so that upon first use after resume we will reacquire the fence reg.

    Reported-by: Keith Packard <keithp@keithp.com>
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* | Merge branch 'drm-intel-fixes' into drm-intel-next (Chris Wilson, 2010-11-28, 1 file, -36/+16)
|\|
    Conflicts:
      drivers/gpu/drm/i915/i915_gem.c
| * drm/i915: fix regression due to ba3d8d749b01548b9 (Daniel Vetter, 2010-11-28, 1 file, -25/+18)
    We don't track gpu flush requests in any special way. So even with obj->write_domain == 0, a gpu flush might be outstanding but not yet executed. Even worse, the latest request might use the object only for reading. So an unconditional call to object_wait_rendering is needed for !pipelined.

    Hence revert that patch fully and untangle the flushing from the synchronization again.

    Reported-by: Keith Packard <keithp@keithp.com>
    Tested-by: Keith Packard <keithp@keithp.com>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* | drm/i915: Avoid allocation for execbuffer object list (Chris Wilson, 2010-11-25, 1 file, -0/+1)
    Besides the minimal improvement in reducing the execbuffer overhead, the real benefit is clarifying a few routines.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* | drm/i915: Split i915_gem_execbuffer into its own file. (Chris Wilson, 2010-11-25, 1 file, -1151/+13)
    A number of dragons have been seen lurking within the execbuffer code. The first step is then to isolate them from the rest and begin to scrutinise them in depth.

    Suggested by Daniel Vetter.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* | drm/i915: Defer accounting until read from debugfs (Chris Wilson, 2010-11-25, 1 file, -106/+22)
    Simply remove our accounting of objects inside the aperture, keeping only track of what is in the aperture and its current usage. This removes the over-complication of BUGs that were attempting to keep the accounting correct and also removes the overhead of the accounting on the hot-paths.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* | drm/i915: Mark a few functions as __must_check (Chris Wilson, 2010-11-25, 1 file, -21/+16)
    ... to benefit from the compiler checking that we remember to handle and propagate errors.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* | drm/i915: Only save and restore fences for UMS (Chris Wilson, 2010-11-25, 1 file, -10/+15)
    With KMS, we can simply relinquish the fence when we idle the GPU and reassign it upon first use.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* | drm/i915: Add a mechanism for pipelining fence register updates (Daniel Vetter, 2010-11-25, 1 file, -42/+91)
    Not employed just yet...

    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* | drm/i915: More accurately track last fence usage by the GPU (Chris Wilson, 2010-11-24, 1 file, -46/+64)
    Based on a patch by Daniel Vetter.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* | drm/i915: Rework execbuffer pinning (Chris Wilson, 2010-11-24, 1 file, -23/+58)
    Avoid evicting buffers that will be used later in the batch in order to make room for the initial buffers by pinning all bound buffers in a single pass before binding (and evicting for) fresh buffers.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>