summaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/i915/i915_gpu_error.c
Commit message (Collapse)AuthorAgeFilesLines
* drm/i915/csr: move CSR version macros to intel_csr.hJani Nikula2019-05-031-0/+1
| | | | | | | | Reduce clutter from i915_drv.h. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/8222df3f559b056387b5c7e6e04a878cbf8b4e2e.1556809195.git.jani.nikula@intel.com
* drm/i915: extract intel_atomic.h from intel_drv.hJani Nikula2019-04-301-0/+1
| | | | | | | | | | | | | | | | | | | | It used to be handy that we only had a couple of headers, but over time intel_drv.h has become unwieldy. Extract declarations to a separate header file corresponding to the implementation module, clarifying the modularity of the driver. Ensure the new header is self-contained, and do so with minimal further includes, using forward declarations as needed. Include the new header only where needed, and sort the modified include directives while at it and as needed. No functional changes. v2: fix sparse warnings on undeclared global functions Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190429125331.32499-1-jani.nikula@intel.com
* drm/i915: extract intel_overlay.h from intel_drv.h and i915_drv.hJani Nikula2019-04-301-1/+2
| | | | | | | | | | | | | | | | | | It used to be handy that we only had a couple of headers, but over time intel_drv.h has become unwieldy. Extract declarations to a separate header file corresponding to the implementation module, clarifying the modularity of the driver. Ensure the new header is self-contained, and do so with minimal further includes, using forward declarations as needed. Include the new header only where needed, and sort the modified include directives while at it and as needed. No functional changes. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/2e4fb1e67ed38870df3040bb0a1b1a58fd90cc86.1556540890.git.jani.nikula@intel.com
* drm/i915: add GEN2_ prefix to the I{E, I, M, S}R registersPaulo Zanoni2019-04-161-2/+2
| | | | | | | | | | | | | | | | | | | | | | This discussion started because we use token pasting in the GEN{2,3}_IRQ_INIT and GEN{2,3}_IRQ_RESET macros, so gen2-4 passes an empty argument to those macros, making the code a little weird. The original proposal was to just add a comment as the empty argument, but Ville suggested we just add a prefix to the registers, and that indeed sounds like a more elegant solution. Now doing this is kinda against our rules for register naming since we only add gens or platform names as register prefixes when the given gen/platform changes a register that already existed before. On the other hand, we have so many instances of IIR/IMR in comments that adding a prefix would make the users of these register more easily findable, in addition to make our token pasting macros actually readable. So IMHO opening an exception here is worth it. Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190410235344.31199-4-paulo.r.zanoni@intel.com
* drm/i915: Make RING_PDP relative to engine->mmio_baseChris Wilson2019-04-051-6/+9
| | | | | | | | | | | | | | | | | | | The PDP registers are an oddity inside the set of context saved registers in that they take the engine as a parameter to the macro and not the mmio_base as the others do. Make it accept the engine->mmio_base for consistency in programming the context registers. add/remove: 0/0 grow/shrink: 2/1 up/down: 3/-32 (-29) Function old new delta emit_ppgtt_update 324 326 +2 capture 5102 5103 +1 execlists_init_reg_state.isra 1128 1096 -32 And similar savings later! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190405123831.9724-1-chris@chris-wilson.co.uk
* drm/i915: Move intel_engine_mask_t around for use by i915_request_types.hChris Wilson2019-04-021-4/+5
| | | | | | | | | | | | | | | | | | We want to use intel_engine_mask_t inside i915_request.h, which means extracting it from the general header file mess and placing it inside a types.h. A knock on effect is that the compiler wants to warn about type-contraction of ALL_ENGINES into intel_engine_maskt_t, so prepare for the worst. v2: Use intel_engine_mask_t consistently v3: Move I915_NUM_ENGINES to its natural home at the end of the enum Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190401162641.10963-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
* drm/i915: Introduce concept of a sub-platformTvrtko Ursulin2019-04-011-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Concept of a sub-platform already exist in our code (like ULX and ULT platform variants and similar),implemented via the macros which check a list of device ids to determine a match. With this patch we consolidate device ids checking into a single function called during early driver load. A few low bits in the platform mask are reserved for sub-platform identification and defined as a per-platform namespace. At the same time it future proofs the platform_mask handling by preparing the code for easy extending, and tidies the very verbose WARN strings generated when IS_PLATFORM macros are embedded into a WARN type statements. v2: Fixed IS_SUBPLATFORM. Updated commit msg. v3: Chris was right, there is an ordering problem. v4: * Catch-up with new sub-platforms. * Rebase for RUNTIME_INFO. * Drop subplatform mask union tricks and convert platform_mask to an array for extensibility. v5: * Fix subplatform check. * Protect against forgetting to expand subplatform bits. * Remove platform enum tallying. * Add subplatform to error state. (Chris) * Drop macros and just use static inlines. * Remove redundant IRONLAKE_M. (Ville) v6: * Split out Ironlake change. * Optimize subplatform check. * Use __always_inline. (Lucas) * Add platform_mask comment. (Paulo) * Pass stored runtime info in error capture. (Chris) v7: * Rebased for new AML ULX device id. * Bump platform mask array size for EHL. * Stop mentioning device ids in intel_device_subplatform_init by using the trick of splitting macros i915_pciids.h. (Jani) * AML seems to be either a subplatform of KBL or CFL so express it like that. v8: * Use one device id table per subplatform. (Jani) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Jose Souza <jose.souza@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190327142328.31780-1-tvrtko.ursulin@linux.intel.com
* drm/i915: take a reference to uncore in the engine and use itDaniele Ceraolo Spurio2019-03-261-21/+21
| | | | | | | | | | | | | | | | | | A few advantages: - Prepares us for the planned split of display uncore from GT uncore - Improves our engine-centric view of the world in the engine code and allows us to avoid jumping back to dev_priv. - Allows us to wrap accesses to engine register in nice macros that automatically pick the right mmio base. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190325214940.23632-10-daniele.ceraolospurio@intel.com
* drm/i915: Stop storing ctx->user_handleChris Wilson2019-03-211-7/+4
| | | | | | | | | The user_handle need only be known by userspace for it to lookup the context via the idr; internally we have no use for it. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190321140711.11190-3-chris@chris-wilson.co.uk
* drm/i915: Fix off-by-one in reporting hanging processChris Wilson2019-03-151-1/+1
| | | | | | | | | | | ffs() is 1-indexed, but we want to use it as an index into an array, so use __ffs() instead. Fixes: eb8d0f5af4ec ("drm/i915: Remove GPU reset dependence on struct_mutex") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190315163933.19352-1-chris@chris-wilson.co.uk
* drm/i915: Move find_active_request() to the engineChris Wilson2019-03-051-1/+1
| | | | | | | | | | To find the active request, we need only search along the individual engine for the right request. This does not require touching any global GEM state, so move it into the engine compartment. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190305180332.30900-3-chris@chris-wilson.co.uk
* drm/i915: Store the BIT(engine->id) as the engine's maskChris Wilson2019-03-051-5/+6
| | | | | | | | | | | | | | | | | | | In the next patch, we are introducing a broad virtual engine to encompass multiple physical engines, losing the 1:1 nature of BIT(engine->id). To reflect the broader set of engines implied by the virtual instance, lets store the full bitmask. v2: Use intel_engine_mask_t (s/ring_mask/engine_mask/) v3: Tvrtko voted for moah churn so teach everyone to not mention ring and use $class$instance throughout. v4: Comment upon the disparity in bspec for using VCS1,VCS2 in gen8 and VCS[0-4] in later gen. We opt to keep the code consistent and use 0-index naming throughout. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190305180332.30900-1-chris@chris-wilson.co.uk
* drm/i915: Stop capturing semaphore registers for gen6/7 GPU hangsChris Wilson2019-03-051-23/+2
| | | | | | | | | | | We no longer use the semaphore sync registers on gen6/7, so including them in the GPU error state is mere noise. References: 6faf5916e6be ("drm/i915: Remove HW semaphores for gen7 inter-engine synchronisation") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190305150914.11340-2-chris@chris-wilson.co.uk
* drm/i915: Remove i915_request.global_seqnoChris Wilson2019-02-261-30/+4
| | | | | | | | | | | Having weaned the interrupt handling off using a single global execution queue, we no longer need to emit a global_seqno. Note that we still have a few assumptions about execution order along engine timelines, but this removes the most obvious artefact! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190226094922.31617-3-chris@chris-wilson.co.uk
* drm/i915: Remove access to global seqno in the HWSPChris Wilson2019-02-261-4/+0
| | | | | | | | | Stop accessing the HWSP to read the global seqno, and stop tracking the mirror in the engine's execution timeline -- it is unused. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190226094922.31617-2-chris@chris-wilson.co.uk
* drm/i915: Use time based guilty context banningChris Wilson2019-02-191-23/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, we accumulate each time a context hangs the GPU, offset against the number of requests it submits, and if that score exceeds a certain threshold, we ban that context from submitting any more requests (cancelling any work in flight). In contrast, we use a simple timer on the file, that if we see more than a 9 hangs faster than 60s apart in total across all of its contexts, we will ban the client from creating any more contexts. This leads to a confusing situation where the file may be banned before the context, so lets use a simple timer scheme for each. If the context submits 3 hanging requests within a 120s period, declare it forbidden to ever send more requests. This has the advantage of not being easy to repair by simply sending empty requests, but has the disadvantage that if the context is idle then it is forgiven. However, if the context is idle, it is not disrupting the system, but a hog can evade the request counting and cause much more severe disruption to the system. Updating ban_score from request retirement is dubious as the retirement is purposely not in sync with request submission (i.e. we try and batch retirement to reduce overhead and avoid latency on submission), which leads to surprising situations where we can forgive a hang immediately due to a backlog of requests from before the hang being retired afterwards. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190219122215.8941-2-chris@chris-wilson.co.uk
* drm/i915: Pull i915_gem_active into the i915_active familyChris Wilson2019-02-051-5/+5
| | | | | | | | | | | | | | | | | | | Looking forward, we need to break the struct_mutex dependency on i915_gem_active. In the meantime, external use of i915_gem_active is quite beguiling, little do new users suspect that it implies a barrier as each request it tracks must be ordered wrt the previous one. As one of many, it can be used to track activity across multiple timelines, a shared fence, which fits our unordered request submission much better. We need to steer external users away from the singular, exclusive fence imposed by i915_gem_active to i915_active instead. As part of that process, we move i915_gem_active out of i915_request.c into i915_active.c to start separating the two concepts, and rename it to i915_active_request (both to tie it to the concept of tracking just one request, and to give it a longer, less appealing name). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190205130005.2807-5-chris@chris-wilson.co.uk
* drm/i915: Drop fake breadcrumb irqChris Wilson2019-01-291-2/+0
| | | | | | | | | | | | | | Missed breadcrumb detection is defunct due to the tight coupling with dma_fence signaling and the myriad ways we may signal fences from everywhere but from an interrupt, i.e. we frequently signal a fence before we even see its interrupt. This means that even if we miss an interrupt for a fence, it still is signaled before our breadcrumb hangcheck fires, so simplify the breadcrumb hangchecking by moving it into the GPU hangcheck and forgo fake interrupts. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190129205230.19056-3-chris@chris-wilson.co.uk
* drm/i915: Replace global breadcrumbs with per-context interrupt trackingChris Wilson2019-01-291-75/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A few years ago, see commit 688e6c725816 ("drm/i915: Slaughter the thundering i915_wait_request herd"), the issue of handling multiple clients waiting in parallel was brought to our attention. The requirement was that every client should be woken immediately upon its request being signaled, without incurring any cpu overhead. To handle certain fragility of our hw meant that we could not do a simple check inside the irq handler (some generations required almost unbounded delays before we could be sure of seqno coherency) and so request completion checking required delegation. Before commit 688e6c725816, the solution was simple. Every client waiting on a request would be woken on every interrupt and each would do a heavyweight check to see if their request was complete. Commit 688e6c725816 introduced an rbtree so that only the earliest waiter on the global timeline would woken, and would wake the next and so on. (Along with various complications to handle requests being reordered along the global timeline, and also a requirement for kthread to provide a delegate for fence signaling that had no process context.) The global rbtree depends on knowing the execution timeline (and global seqno). Without knowing that order, we must instead check all contexts queued to the HW to see which may have advanced. We trim that list by only checking queued contexts that are being waited on, but still we keep a list of all active contexts and their active signalers that we inspect from inside the irq handler. By moving the waiters onto the fence signal list, we can combine the client wakeup with the dma_fence signaling (a dramatic reduction in complexity, but does require the HW being coherent, the seqno must be visible from the cpu before the interrupt is raised - we keep a timer backup just in case). Having previously fixed all the issues with irq-seqno serialisation (by inserting delays onto the GPU after each request instead of random delays on the CPU after each interrupt), we can rely on the seqno state to perfom direct wakeups from the interrupt handler. This allows us to preserve our single context switch behaviour of the current routine, with the only downside that we lose the RT priority sorting of wakeups. In general, direct wakeup latency of multiple clients is about the same (about 10% better in most cases) with a reduction in total CPU time spent in the waiter (about 20-50% depending on gen). Average herd behaviour is improved, but at the cost of not delegating wakeups on task_prio. v2: Capture fence signaling state for error state and add comments to warm even the most cold of hearts. v3: Check if the request is still active before busywaiting v4: Reduce the amount of pointer misdirection with list_for_each_safe and using a local i915_request variable inside the loops v5: Add a missing pluralisation to a purely informative selftest message. References: 688e6c725816 ("drm/i915: Slaughter the thundering i915_wait_request herd") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190129205230.19056-2-chris@chris-wilson.co.uk
* drm/i915: Stop tracking MRU activity on VMAChris Wilson2019-01-281-21/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Our goal is to remove struct_mutex and replace it with fine grained locking. One of the thorny issues is our eviction logic for reclaiming space for an execbuffer (or GTT mmaping, among a few other examples). While eviction itself is easy to move under a per-VM mutex, performing the activity tracking is less agreeable. One solution is not to do any MRU tracking and do a simple coarse evaluation during eviction of active/inactive, with a loose temporal ordering of last insertion/evaluation. That keeps all the locking constrained to when we are manipulating the VM itself, neatly avoiding the tricky handling of possible recursive locking during execbuf and elsewhere. Note that discarding the MRU (currently implemented as a pair of lists, to avoid scanning the active list for a NONBLOCKING search) is unlikely to impact upon our efficiency to reclaim VM space (where we think a LRU model is best) as our current strategy is to use random idle replacement first before doing a search, and over time the use of softpinned 48b per-ppGTT is growing (thereby eliminating any need to perform any eviction searches, in theory at least) with the remaining users being found on much older devices (gen2-gen6). v2: Changelog and commentary rewritten to elaborate on the duality of a single list being both an inactive and active list. v3: Consolidate bool parameters into a single set of flags; don't comment on the duality of a single variable being a multiplicity of bits. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-1-chris@chris-wilson.co.uk
* drm/i915: Remove GPU reset dependence on struct_mutexChris Wilson2019-01-251-55/+49
| | | | | | | | | | | | | | | | | Now that the submission backends are controlled via their own spinlocks, with a wave of a magic wand we can lift the struct_mutex requirement around GPU reset. That is we allow the submission frontend (userspace) to keep on submitting while we process the GPU reset as we can suspend the backend independently. The major change is around the backoff/handoff strategy for performing the reset. With no mutex deadlock, we no longer have to coordinate with any waiter, and just perform the reset immediately. Testcase: igt/gem_mmap_gtt/hang # regresses Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190125132230.22221-3-chris@chris-wilson.co.uk
* drm/i915: small isolated c99 types to kernel types switchJani Nikula2019-01-171-5/+5
| | | | | | | | | | | | | | Mixed C99 and kernel types use is getting ugly. Prefer kernel types. sed -i 's/\buint\(8\|16\|32\|64\)_t\b/u\1/g' Minor checkpatch fixes sprinkled on top of the changed lines. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/14ed72e7f04c9340a057855c5950b54811f8a477.1547629303.git.jani.nikula@intel.com
* drm/i915: Guard error capture against unpinned vmaChris Wilson2019-01-101-1/+1
| | | | | | | | | | | | | | If we find an incompletely setup vma inside the request/engine at the time of a hang, it may not have vma->pages initialised, so skip capturing the object before we iterate over NULL. Spotted by Matthew in preparation for using unpinned vma to track engine state. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190110111522.11023-1-chris@chris-wilson.co.uk
* drm/i915: Show machine type in error stateChris Wilson2019-01-031-1/+3
| | | | | | | | | | | As the question of 32b/64b kernels became relevant in the light of certain bugs, include that information in the error state. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190103101245.15100-1-chris@chris-wilson.co.uk
* drm/i915: start moving runtime device info to a separate structJani Nikula2019-01-021-2/+7
| | | | | | | | | | | First move the low hanging fruit, the fields that are only initialized runtime. Use RUNTIME_INFO() exclusively to access the fields. Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/c24fe7a4b0492a888690c46814c0ff21ce2f12b1.1546267488.git.jani.nikula@intel.com
* drm/i915: add a helper to free the members of i915_paramsJani Nikula2018-12-311-8/+1
| | | | | | | | | Abstract the one user in anticipation of more. Set the dangling pointers to NULL while at it. Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/8637d1e5049dc003718772f19d664aeaf9540856.1545920737.git.jani.nikula@intel.com
* drm/i915: add a helper to make a copy of i915_paramsJani Nikula2018-12-311-10/+1
| | | | | | | | Abstract the one user in anticipation of more. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/c6a94b4da8dc723df025b1f602fe46d76d00d53f.1545920737.git.jani.nikula@intel.com
* drm/i915: merge gen checks to use rangeLucas De Marchi2018-12-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of using IS_GEN() for consecutive gen checks, let's pass the range to IS_GEN_RANGE(). By code inspection these were the ranges deemed necessary for spatch: @@ expression e; @@ ( - IS_GEN(e, 3) || IS_GEN(e, 2) + IS_GEN_RANGE(e, 2, 3) | - IS_GEN(e, 3) || IS_GEN(e, 4) + IS_GEN_RANGE(e, 3, 4) | - IS_GEN(e, 5) || IS_GEN(e, 6) + IS_GEN_RANGE(e, 5, 6) | - IS_GEN(e, 6) || IS_GEN(e, 7) + IS_GEN_RANGE(e, 6, 7) | - IS_GEN(e, 7) || IS_GEN(e, 8) + IS_GEN_RANGE(e, 7, 8) | - IS_GEN(e, 8) || IS_GEN(e, 9) + IS_GEN_RANGE(e, 8, 9) | - IS_GEN(e, 10) || IS_GEN(e, 9) + IS_GEN_RANGE(e, 9, 10) | - IS_GEN(e, 9) || IS_GEN(e, 10) + IS_GEN_RANGE(e, 9, 10) ) After conversion, checking we don't have any missing IS_GEN_RANGE() || IS_GEN() was also done. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181212181044.15886-3-lucas.demarchi@intel.com
* drm/i915: replace IS_GEN<N> with IS_GEN(..., N)Lucas De Marchi2018-12-121-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Define IS_GEN() similarly to our IS_GEN_RANGE(). but use gen instead of gen_mask to do the comparison. Now callers can pass then gen as a parameter, so we don't require one macro for each gen. The following spatch was used to convert the users of these macros: @@ expression e; @@ ( - IS_GEN2(e) + IS_GEN(e, 2) | - IS_GEN3(e) + IS_GEN(e, 3) | - IS_GEN4(e) + IS_GEN(e, 4) | - IS_GEN5(e) + IS_GEN(e, 5) | - IS_GEN6(e) + IS_GEN(e, 6) | - IS_GEN7(e) + IS_GEN(e, 7) | - IS_GEN8(e) + IS_GEN(e, 8) | - IS_GEN9(e) + IS_GEN(e, 9) | - IS_GEN10(e) + IS_GEN(e, 10) | - IS_GEN11(e) + IS_GEN(e, 11) ) v2: use IS_GEN rather than GT_GEN and compare to info.gen rather than using the bitmask Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181212181044.15886-2-lucas.demarchi@intel.com
* drm/i915: Skip the ERR_PTR error stateChris Wilson2018-12-071-9/+14
| | | | | | | | | | | | | | | | | | | Although commit fb6f0b64e455 ("drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture") applied cleanly after a 24 month hiatus, the code had moved on with new methods for peeking and fetching the captured gpu info. Make sure we catch all uses of the stashed error state and avoid dereferencing the error pointer. v2: Move error pointer determination into i915_gpu_capture_state v3: Restore early check to avoid capturing and then throwing away subsequent GPU error states. Fixes: fb6f0b64e455 ("drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181207110554.19897-1-chris@chris-wilson.co.uk
* drm/i915: Allocate a common scratch pageChris Wilson2018-12-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Currently we allocate a scratch page for each engine, but since we only ever write into it for post-sync operations, it is not exposed to userspace nor do we care for coherency. As we then do not care about its contents, we can use one page for all, reducing our allocations and avoid complications by not assuming per-engine isolation. For later use, it simplifies engine initialisation (by removing the allocation that required struct_mutex!) and means that we can always rely on there being a scratch page. v2: Check that we allocated a large enough scratch for I830 w/a Fixes: 06e562e7f515 ("drm/i915/ringbuffer: Delay after EMIT_INVALIDATE for gen4/gen5") # v4.18.20 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108850 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181204141522.13640-1-chris@chris-wilson.co.uk Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <stable@vger.kernel.org> # v4.18.20+
* drm/i915: Cache the error stringChris Wilson2018-11-231-129/+206
| | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, we convert the error state into a string every time we read from sysfs (and sysfs reads in page size (4KiB) chunks). We do try to window the string and only capture the portion that is being read, but that means that we must always convert up to the window to find the start. For a very large error state bordering on EXEC_OBJECT_CAPTURE abuse, this is noticeable as it degrades to O(N^2)! As we do not have a convenient hook for sysfs open(), and we would like to keep the lazy conversion into a string, do the conversion of the whole string on the first read and keep the string until the error state is freed. v2: Don't double advance simple_read_from_buffer v3: Due to extreme pain of lack of vrealloc, use a scatterlist v4: Keep the forward iterator loosely cached v5: Stylistic improvements to reduce patch size Reported-by: Jason Ekstrand <jason@jlekstrand.net> Testcase: igt/gem_exec_capture/many* Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181123132325.26541-1-chris@chris-wilson.co.uk
* drm/i915: avoid rebuilding i915_gpu_error.o on version string updatesHans Holmberg2018-11-221-2/+2
| | | | | | | | | | | There is no need to rebuild i915_gpu_error.o when the version string changes as the version is available in init_utsname()->release. Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181121095423.20760-1-hans.ml.holmberg@owltronix.com
* drm/i915: Prevent machine hang from Broxton's vtd w/a and error captureChris Wilson2018-11-191-1/+14
| | | | | | | | | | | | | | | | | | | | | Since capturing the error state requires fiddling around with the GGTT to read arbitrary buffers and is itself run under stop_machine(), it deadlocks the machine (effectively a hard hang) when run in conjunction with Broxton's VTd workaround to serialize GGTT access. v2: Store the ERR_PTR in first_error so that the error can be reported to the user via sysfs. v3: Mention the quirk in dmesg (using info as per usual) Fixes: 0ef34ad6222a ("drm/i915: Serialize GTT/Aperture accesses on BXT") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: John Harrison <john.C.Harrison@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181102161232.17742-5-chris@chris-wilson.co.uk
* drm/i915: Clear the error PTE just once on finishChris Wilson2018-10-031-1/+9
| | | | | | | | | | We do not need to continually clear our dedicated PTE for error capture as it will be updated and invalidated to the next object. Only at the end do we wish to be sure that the PTE doesn't point back to any buffer. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181001194447.29910-3-chris@chris-wilson.co.uk
* drm/i915: Handle incomplete Z_FINISH for compressed error statesChris Wilson2018-10-031-25/+63
| | | | | | | | | | | | | | | | | | | | The final call to zlib_deflate(Z_FINISH) may require more output space to be allocated and so needs to re-invoked. Failure to do so in the current code leads to incomplete zlib streams (albeit intact due to the use of Z_SYNC_FLUSH) resulting in the occasional short object capture. v2: Check against overrunning our pre-allocated page array v3: Drop Z_SYNC_FLUSH entirely Testcase: igt/i915-error-capture.js Fixes: 0a97015d45ee ("drm/i915: Compress GPU objects in error state") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <stable@vger.kernel.org> # v4.10+ Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181003082422.23214-1-chris@chris-wilson.co.uk
* drm/i915: Remove i915.enable_ppgtt overrideChris Wilson2018-09-271-2/+2
| | | | | | | | | | | | | | | | | | | Now that we are confident in providing full-ppgtt where supported, remove the ability to override the context isolation. v2: Remove faked aliasing-ppgtt for testing as it no longer is accepted. v3: s/USES/HAS/ to match usage and reject attempts to load the module on old GVT-g setups that do not provide support for full-ppgtt. v4: Insulate ABI ppGTT values from our internal enum (later plans involve moving ppGTT depth out of the enum, thus potentially breaking ABI unless we document the current values). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Zhi Wang <zhi.a.wang@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180926201222.5643-1-chris@chris-wilson.co.uk
* Merge drm/drm-next into drm-intel-next-queuedJani Nikula2018-09-261-30/+4
|\ | | | | | | | | | | | | Catch up in general, and get DP_EXTENDED_RECEIVER_CAP_FIELD_PRESENT specifically. Signed-off-by: Jani Nikula <jani.nikula@intel.com>
| * include: Move ascii85 functions from i915 to linux/ascii85.hJordan Crouse2018-07-301-30/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The i915 DRM driver very cleverly used ascii85 encoding for their GPU state file. Move the encode functions to a general header file to support other drivers that might be interested in the same functionality. v4: Make the return value const char * as suggested by Chris Wilson v3: Fix error_puts -> err_puts pointed out by the 01.org bot v2: Update API to be cleaner for the caller as suggested by Chris Wilson Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com>
* | drm/i915: Limit number of capture objectsChris Wilson2018-09-141-7/+13
|/ | | | | | | | | | If we fail to allocate an array for a large number of user requested capture objects, reduce the array size and try to grab at least some of the objects! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180911115810.8917-3-chris@chris-wilson.co.uk
* drm/i915: Track vma activity per fence.context, not per engineChris Wilson2018-07-061-11/+3
| | | | | | | | | | | | | | | | | | | In the next patch, we will want to be able to use more flexible request timelines that can hop between engines. From the vma pov, we can then not rely on the binding of this request to an engine and so can not ensure that different requests are ordered through a per-engine timeline, and so we must track activity of all timelines. (We track activity on the vma itself to prevent unbinding from HW before the HW has finished accessing it.) v2: Switch to a rbtree for 32b safety (since using u64 as a radixtree index is fraught with aliasing of unsigned longs). v3: s/lookup_active/active_instance/ because we can never agree on names Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180706103947.15919-5-chris@chris-wilson.co.uk
* drm/i915: Prepare for non-object vmaChris Wilson2018-06-071-0/+3
| | | | | | | | | | | | | | | | | | | In order to allow ourselves to use VMA to wrap other entities other than GEM objects, we need to allow for the vma->obj backpointer to be NULL. In most cases, we know we are operating on a GEM object and its vma, but we need the core code (such as i915_vma_pin/insert/bind/unbind) to work regardless of the innards. The remaining eyesore here is vma->obj->cache_level and related (but less of an issue) vma->obj->gt_ro. With a bit of care we should mirror those on the vma itself. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180607154047.9171-1-chris@chris-wilson.co.uk
* drm/i915/gtt: Rename i915_hw_ppgtt base memberChris Wilson2018-06-051-5/+4
| | | | | | | | | | | | | | | In the near future, I want to subclass gen6_hw_ppgtt as it contains a few specialised members and I wish to add more. To avoid the ugliness of using ppgtt->base.base, rename the i915_hw_ppgtt base member (i915_address_space) as vm, which is our common shorthand for an i915_address_space local. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180605153758.18422-1-chris@chris-wilson.co.uk
* drm/i915/error: Fixup inactive/active countingChris Wilson2018-06-051-2/+2
| | | | | | | | | | | | | The inactive counter was over the active list, and vice versa. Fortuitously this should not cause a problem in practice as they shared the same array and clamped the number of entries they would write. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180605160623.30163-1-chris@chris-wilson.co.uk
* drm/i915: Store a pointer to intel_context in i915_requestChris Wilson2018-05-181-2/+1
| | | | | | | | | | | | | | | | | To ease the frequent and ugly pointer dance of &request->gem_context->engine[request->engine->id] during request submission, store that pointer as request->hw_context. One major advantage that we will exploit later is that this decouples the logical context state from the engine itself. v2: Set mock_context->ops so we don't crash and burn in selftests. Cleanups from Tvrtko. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180517212633.24934-3-chris@chris-wilson.co.uk
* drm/i915: Move request->ctx asideChris Wilson2018-05-181-8/+10
| | | | | | | | | | | | | | In the next patch, we want to store the intel_context pointer inside i915_request, as it is frequently access via a convoluted dance when submitting the request to hw. Having two context pointers inside i915_request leads to confusion so first rename the existing i915_gem_context pointer to i915_request.gem_context. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180517212633.24934-1-chris@chris-wilson.co.uk
* drm/i915/icl: Read the correct Gen11 interrupt registersOscar Mateo2018-05-171-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Stop reading some now deprecated interrupt registers in both debugfs and error state. Instead, read the new equivalents in the Gen11 interrupt repartitioning scheme. Note that the equivalent to the PM ISR & IIR cannot be read without affecting the current state of the system, so I've opted for leaving them out. See gen11_reset_one_iir() for more info. v2: else if !!! (Paulo) v3: another else if (Vinay) v4: - Rebased - Renamed patch - Improved the ordering of GENs - Improved the printing of per-GEN info v5: Avoid maybe-unitialized & add comment explaining the lack of PM ISR & IIR Suggested-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> [Paulo: fix commit message and coding style.] Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1525989595-18220-1-git-send-email-oscar.mateo@intel.com
* drm/i915: Split i915_gem_timeline into individual timelinesChris Wilson2018-05-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | We need to move to a more flexible timeline that doesn't assume one fence context per engine, and so allow for a single timeline to be used across a combination of engines. This means that preallocating a fence context per engine is now a hindrance, and so we want to introduce the singular timeline. From the code perspective, this has the notable advantage of clearing up a lot of mirky semantics and some clumsy pointer chasing. By splitting the timeline up into a single entity rather than an array of per-engine timelines, we can realise the goal of the previous patch of tracking the timeline alongside the ring. v2: Tweak wait_for_idle to stop the compiling thinking that ret may be uninitialised. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180502163839.3248-2-chris@chris-wilson.co.uk
* drm/i915: Show ring->start for the ELSP context/request queueChris Wilson2018-05-021-2/+3
| | | | | | | | | | | | | Since the advent of execlists, the HW no longer executes from a single statically assigned ring, but instead switches to a different ring for each context (logical ringbuffer contexts as it is called). So a good way to tally the executing context against what we have queued is by comparing the RING_START register against our requests. Make it so. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180502104150.29874-1-chris@chris-wilson.co.uk
* drm/i915: Print error state times relative to captureMika Kuoppala2018-05-021-9/+38
| | | | | | | | | | | | | | | | | | | Using plain jiffies in error state output makes the output time differences relative to the current system time. This is wrong as it makes output time differences dependent of when the error state is printed rather than when it is captured. Store capture jiffies into error state and use it when outputting the state to fix time differences output. v2: use engine timestamp as epoch, output formatting (Chris) v3: pass epoch to print_engine/request (Chris) Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180430075259.4476-1-mika.kuoppala@linux.intel.com