summaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/i915
Commit message (Collapse)AuthorAgeFilesLines
* drm/i915/adlp: Fix typo for reference clockChaitanya Kumar Borah2023-01-301-1/+1
| | | | | | | | | | | | | Fix typo for reference clock from 24400 to 24000. Bspec: 55409 Fixes: 626426ff9ce4 ("drm/i915/adl_p: Add cdclk support for ADL-P") Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230112094131.550252-1-chaitanya.kumar.borah@intel.com (cherry picked from commit 2b6f7e39ccae065abfbe3b6e562ec95ccad09f1e) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
* drm/i915: Fix potential bit_17 double-freeRob Clark2023-01-301-4/+5
| | | | | | | | | | | | | | | | | | A userspace with multiple threads racing I915_GEM_SET_TILING to set the tiling to I915_TILING_NONE could trigger a double free of the bit_17 bitmask. (Or conversely leak memory on the transition to tiled.) Move allocation/free'ing of the bitmask within the section protected by the obj lock. Signed-off-by: Rob Clark <robdclark@chromium.org> Fixes: 2850748ef876 ("drm/i915: Pull i915_vma_pin under the vm->mutex") Cc: <stable@vger.kernel.org> # v5.5+ [tursulin: Correct fixes tag and added cc stable.] Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230127200550.3531984-1-robdclark@gmail.com (cherry picked from commit 10e0cbaaf1104f449d695c80bcacf930dcd3c42e) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
* drm/i915: Fix up locking around dumping requests listsJohn Harrison2023-01-305-62/+73
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The debugfs dump of requests was confused about what state requires the execlist lock versus the GuC lock. There was also a bunch of duplicated messy code between it and the error capture code. So refactor the hung request search into a re-usable function. And reduce the span of the execlist state lock to only the execlist specific code paths. In order to do that, also move the report of hold count (which is an execlist only concept) from the top level dump function to the lower level execlist specific function. Also, move the execlist specific code into the execlist source file. v2: Rename some functions and move to more appropriate files (Daniele). v3: Rename new execlist dump function (Daniele) Fixes: dc0dad365c5e ("drm/i915/guc: Fix for error capture after full GPU reset with GuC") Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: Michael Cheng <michael.cheng@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Bruce Chang <yu.bruce.chang@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-4-John.C.Harrison@Intel.com (cherry picked from commit a4be3dca53172d9d2091e4b474fb795c81ed3d6c) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
* drm/i915: Fix request ref counting during error capture & debugfs dumpJohn Harrison2023-01-305-12/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When GuC support was added to error capture, the reference counting around the request object was broken. Fix it up. The context based search manages the spinlocking around the search internally. So it needs to grab the reference count internally as well. The execlist only request based search relies on external locking, so it needs an external reference count but within the spinlock not outside it. The only other caller of the context based search is the code for dumping engine state to debugfs. That code wasn't previously getting an explicit reference at all as it does everything while holding the execlist specific spinlock. So, that needs updaing as well as that spinlock doesn't help when using GuC submission. Rather than trying to conditionally get/put depending on submission model, just change it to always do the get/put. v2: Explicitly document adding an extra blank line in some dense code (Andy Shevchenko). Fix multiple potential null pointer derefs in case of no request found (some spotted by Tvrtko, but there was more!). Also fix a leaked request in case of !started and another in __guc_reset_context now that intel_context_find_active_request is actually reference counting the returned request. v3: Add a _get suffix to intel_context_find_active_request now that it grabs a reference (Daniele). v4: Split the intel_guc_find_hung_context change to a separate patch and rename intel_context_find_active_request_get to intel_context_get_active_request (Tvrtko). v5: s/locking/reference counting/ in commit message (Tvrtko) Fixes: dc0dad365c5e ("drm/i915/guc: Fix for error capture after full GPU reset with GuC") Fixes: 573ba126aef3 ("drm/i915/guc: Capture error state on context reset") Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Andrzej Hajda <andrzej.hajda@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: Michael Cheng <michael.cheng@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Bruce Chang <yu.bruce.chang@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-3-John.C.Harrison@Intel.com (cherry picked from commit 3700e353781e27f1bc7222f51f2cc36cbeb9b4ec) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
* drm/i915/guc: Fix locking when searching for a hung requestJohn Harrison2023-01-301-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | intel_guc_find_hung_context() was not acquiring the correct spinlock before searching the request list. So fix that up. While at it, add some extra whitespace padding for readability. Fixes: dc0dad365c5e ("drm/i915/guc: Fix for error capture after full GPU reset with GuC") Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: Michael Cheng <michael.cheng@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com> Cc: Chris Wilson <chris.p.wilson@intel.com> Cc: Bruce Chang <yu.bruce.chang@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-2-John.C.Harrison@Intel.com (cherry picked from commit d1c3717501bcf56536e8b8c1bdaf5cd5357f6bb2) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
* drm/i915: Avoid potential vm use-after-freeRob Clark2023-01-301-3/+11
| | | | | | | | | | | | | | | Adding the vm to the vm_xa table makes it visible to userspace, which could try to race with us to close the vm. So we need to take our extra reference before putting it in the table. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Fixes: 9ec8795e7d91 ("drm/i915: Drop __rcu from gem_context->vm") Cc: <stable@vger.kernel.org> # v5.16+ Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230119173321.2825472-1-robdclark@gmail.com (cherry picked from commit 99343c46d4e2b34c285d3d5f68ff04274c2f9fb4) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
* Merge tag 'drm-misc-fixes-2023-01-26' of ↵Dave Airlie2023-01-271-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | git://anongit.freedesktop.org/drm/drm-misc into drm-fixes A fix and a preliminary patch to fix a memory leak in i915, and a use after free fix for fbdev deferred io Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20230126104018.cbrcjxl5wefdbb2f@houat
| * drm/i915: Fix a memory leak with reused mmap_offsetNirmoy Das2023-01-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | drm_vma_node_allow() and drm_vma_node_revoke() should be called in balanced pairs. We call drm_vma_node_allow() once per-file everytime a user calls mmap_offset, but only call drm_vma_node_revoke once per-file on each mmap_offset. As the mmap_offset is reused by the client, the per-file vm_count may remain non-zero and the rbtree leaked. Call drm_vma_node_allow_once() instead to prevent that memory leak. Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Fixes: 786555987207 ("drm/i915/gem: Store mmap_offsets in an rbtree rather than a plain list") Reported-by: Chuansheng Liu <chuansheng.liu@intel.com> Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://lore.kernel.org/r/20230117175236.22317-2-nirmoy.das@intel.com Signed-off-by: Maxime Ripard <maxime@cerno.tech>
* | drm/i915/selftest: fix intel_selftest_modify_policy argument typesArnd Bergmann2023-01-231-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The definition of intel_selftest_modify_policy() does not match the declaration, as gcc-13 points out: drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.c:29:5: error: conflicting types for 'intel_selftest_modify_policy' due to enum/integer mismatch; have 'int(struct intel_engine_cs *, struct intel_selftest_saved_policy *, u32)' {aka 'int(struct intel_engine_cs *, struct intel_selftest_saved_policy *, unsigned int)'} [-Werror=enum-int-mismatch] 29 | int intel_selftest_modify_policy(struct intel_engine_cs *engine, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.c:11: drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.h:28:5: note: previous declaration of 'intel_selftest_modify_policy' with type 'int(struct intel_engine_cs *, struct intel_selftest_saved_policy *, enum selftest_scheduler_modify)' 28 | int intel_selftest_modify_policy(struct intel_engine_cs *engine, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ Change the type in the definition to match. Fixes: 617e87c05c72 ("drm/i915/selftest: Fix hangcheck self test for GuC submission") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230117163743.1003219-1-arnd@kernel.org (cherry picked from commit 8d7eb8ed3f83f248e01a4f548d9c500a950a2c2d) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
* | drm/i915/mtl: Fix bcs default contextLucas De Marchi2023-01-231-36/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 0d0e7d1eea9e ("drm/i915/mtl: Define engine context layouts") added the engine context for Meteor Lake. In a second revision of the patch it was believed the xcs offsets were wrong due to a tagging issue in the spec. The first version was actually correct, as shown by the intel_lrc_live_selftests/live_lrc_layout test: i915: Running gt_lrc i915: Running intel_lrc_live_selftests/live_lrc_layout bcs0: LRI command mismatch at dword 1, expected 1108101d found 11081019 [drm:drm_helper_probe_single_connector_modes [drm_kms_helper]] [CONNECTOR:236:DP-1] disconnected bcs0: HW register image: [0000] 00000000 1108101d 00022244 ffff0008 00022034 00000088 00022030 00000088 ... bcs0: SW register image: [0000] 00000000 11081019 00022244 00090009 00022034 00000000 00022030 00000000 The difference in the 2 additional dwords (0x1d vs 0x19) are the offsets 0x120 / 0x124 that are indeed part of the context image. Bspec: 45585 Fixes: 0d0e7d1eea9e ("drm/i915/mtl: Define engine context layouts") Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230111235531.3353815-2-radhakrishna.sripada@intel.com (cherry picked from commit ca54a9a32da0f0ef7e5cbcd111b66f3c9d78b7d2) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
* | Merge tag 'drm-misc-fixes-2023-01-19' of ↵Dave Airlie2023-01-202-5/+6
|\| | | | | | | | | | | | | | | | | | | | | | | | | git://anongit.freedesktop.org/drm/drm-misc into drm-fixes A fix for vc4 to address a memory leak when allocating a buffer, a Kconfig fix for panfrost and two fixes for i915 and fb-helper to address some bugs with vga-switcheroo. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20230119082059.h32bs7zqoxmjbcvn@houat
| * drm/i915: Remove unused variableNirmoy Das2023-01-181-2/+0
| | | | | | | | | | | | | | | | | | | | | | Removed unused i915 var. Fixes: a273e95721e9 ("drm/i915: Allow switching away via vga-switcheroo if uninitialized") Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230118170624.9326-1-nirmoy.das@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
| * drm/i915: Allow switching away via vga-switcheroo if uninitializedThomas Zimmermann2023-01-182-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Always allow switching away via vga-switcheroo if the display is uninitalized. Instead prevent switching to i915 if the device has not been initialized. This issue was introduced by commit 5df7bd130818 ("drm/i915: skip display initialization when there is no display") protected, which protects code paths from being executed on uninitialized devices. In the case of vga-switcheroo, we want to allow a switch away from i915's device. So run vga_switcheroo_process_delayed_switch() and test in the switcheroo callbacks if the i915 device is available. Fixes: 5df7bd130818 ("drm/i915: skip display initialization when there is no display") Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: "Ville Syrjälä" <ville.syrjala@linux.intel.com> Cc: Manasi Navare <manasi.d.navare@intel.com> Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Imre Deak <imre.deak@intel.com> Cc: "Jouni Högander" <jouni.hogander@intel.com> Cc: Uma Shankar <uma.shankar@intel.com> Cc: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Ramalingam C <ramalingam.c@intel.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Andi Shyti <andi.shyti@linux.intel.com> Cc: Andrzej Hajda <andrzej.hajda@intel.com> Cc: "José Roberto de Souza" <jose.souza@intel.com> Cc: Julia Lawall <Julia.Lawall@inria.fr> Cc: intel-gfx@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v5.14+ Link: https://patchwork.freedesktop.org/patch/msgid/20230116115425.13484-2-tzimmermann@suse.de
* | drm/i915/dg2: Introduce Wa_18019271663Matt Atwood2023-01-182-3/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | Wa_18019271663 applies to all DG2 steppings and skus. Bspec: 66622 Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221123183648.407058-2-matthew.s.atwood@intel.com (cherry picked from commit 900a80c5836587d95db32742f66e1f34f7b40fcb) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
* | drm/i915/dg2: Introduce Wa_18018764978Matt Atwood2023-01-182-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Wa_18018764978 applies to specific steppings of DG2 (G10 C0+, G11 and G12 A0+). Clean up style in function at the same time. Bspec: 66622 Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221123183648.407058-1-matthew.s.atwood@intel.com (cherry picked from commit 468a4e630c7da8cf586f85cc498d6097aed1ab4b) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
* | drm/i915/selftests: Unwind hugepages to drop wakeref on errorChris Wilson2023-01-181-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make sure that upon error after we have acquired the wakeref we do release it again. v2: add another missing "goto out_wf"(Andi). Fixes: 027c38b4121e ("drm/i915/selftests: Grab the runtime pm in shrink_thp") Cc: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Signed-off-by: Chris Wilson <chris.p.wilson@linux.intel.com> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230117123234.26487-1-nirmoy.das@intel.com (cherry picked from commit 14ec40a88210151296fff3e981c1a7196ad9bf55) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
* | drm/i915: re-disable RC6p on Sandy BridgeSasa Dragic2023-01-181-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | RC6p on Sandy Bridge got re-enabled over time, causing visual glitches and GPU hangs. Disabled originally in commit 1c8ecf80fdee ("drm/i915: do not enable RC6p on Sandy Bridge"). Signed-off-by: Sasa Dragic <sasa.dragic@gmail.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221219172927.9603-2-sasa.dragic@gmail.com Fixes: fb6db0f5bf1d ("drm/i915: Remove unsafe i915.enable_rc6") Fixes: 13c5a577b342 ("drm/i915/gt: Select the deepest available parking mode for rc6") Cc: stable@vger.kernel.org Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 0c8a6e9ea232c221976a0670256bd861408d9917) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
* | drm/i915/display: Check source height is > 0Drew Davenport2023-01-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | The error message suggests that the height of the src rect must be at least 1. Reject source with height of 0. Cc: stable@vger.kernel.org Signed-off-by: Drew Davenport <ddavenport@chromium.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221226225246.1.I15dff7bb5a0e485c862eae61a69096caf12ef29f@changeid (cherry picked from commit 0fe76b198d482b41771a8d17b45fb726d13083cf) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
* | drm/i915/gt: Cover rest of SVG unit MCR registersGustavo Sousa2023-01-112-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CHICKEN_RASTER_{1,2} got overlooked with the move done in commit a9e69428b1b4 ("drm/i915: Define MCR registers explicitly"). Registers from the SVG unit became multicast as of Xe_HP graphics. BSpec: 66534 Fixes: a9e69428b1b4 ("drm/i915: Define MCR registers explicitly") Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230105133701.19556-1-gustavo.sousa@intel.com (cherry picked from commit 10903b0a0f4d4964b352fa3df12d3d2ef5fb7a3b) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
* | drm/i915/gt: Reset twiceChris Wilson2023-01-101-6/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After applying an engine reset, on some platforms like Jasperlake, we occasionally detect that the engine state is not cleared until shortly after the resume. As we try to resume the engine with volatile internal state, the first request fails with a spurious CS event (it looks like it reports a lite-restore to the hung context, instead of the expected idle->active context switch). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221212161338.1007659-1-andi.shyti@linux.intel.com (cherry picked from commit 3db9d590557da3aa2c952f2fecd3e9b703dad790) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
* | drm/i915: Fix potential context UAFsRob Clark2023-01-091-6/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | gem_context_register() makes the context visible to userspace, and which point a separate thread can trigger the I915_GEM_CONTEXT_DESTROY ioctl. So we need to ensure that nothing uses the ctx ptr after this. And we need to ensure that adding the ctx to the xarray is the *last* thing that gem_context_register() does with the ctx pointer. Signed-off-by: Rob Clark <robdclark@chromium.org> Fixes: eb4dedae920a ("drm/i915/gem: Delay tracking the GEM context until it is registered") Fixes: a4c1cdd34e2c ("drm/i915/gem: Delay context creation (v3)") Fixes: 49bd54b390c2 ("drm/i915: Track all user contexts per client") Cc: <stable@vger.kernel.org> # v5.10+ Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> [tursulin: Stable and fixes tags add/tidy.] Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230103234948.1218393-1-robdclark@gmail.com (cherry picked from commit bed4b455cf5374e68879be56971c1da563bcd90c) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
* | drm/i915: Reserve enough fence slot for i915_vma_unbind_asyncNirmoy Das2023-01-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A nested dma_resv_reserve_fences(1) will not reserve slot from the 2nd call onwards and folowing dma_resv_add_fence() might hit the "BUG_ON(fobj->num_fences >= fobj->max_fences)" check. I915 hit above nested dma_resv case in ttm_bo_handle_move_mem() with async unbind: dma_resv_reserve_fences() from --> ttm_bo_handle_move_mem() dma_resv_reserve_fences() from --> i915_vma_unbind_async() dma_resv_add_fence() from --> i915_vma_unbind_async() dma_resv_add_fence() from -->ttm_bo_move_accel_cleanup() Resolve this by adding an extra fence in i915_vma_unbind_async(). Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Fixes: 2f6b90da9192 ("drm/i915: Use vma resources for async unbinding") Cc: <stable@vger.kernel.org> # v5.18+ Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221223092011.11657-1-nirmoy.das@intel.com (cherry picked from commit 4f0755c2faf7388616109717facc5bbde6850e60) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
* | drm/i915/gvt: fix double free bug in split_2MB_gtt_entryZheng Wang2023-01-041-4/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If intel_gvt_dma_map_guest_page failed, it will call ppgtt_invalidate_spt, which will finally free the spt. But the caller function ppgtt_populate_spt_by_guest_entry does not notice that, it will free spt again in its error path. Fix this by canceling the mapping of DMA address and freeing sub_spt. Besides, leave the handle of spt destroy to caller function instead of callee function when error occurs. Fixes: b901b252b6cf ("drm/i915/gvt: Add 2M huge gtt support") Signed-off-by: Zheng Wang <zyytlz.wz@163.com> Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20221229165641.1192455-1-zyytlz.wz@163.com
* | drm/i915/gvt: use atomic operations to change the vGPU statusZhi Wang2023-01-048-40/+53
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Several vGPU status are used to decide the availability of GVT-g core logics when creating a vGPU. Use atomic operations on changing the vGPU status to avoid the racing. Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: intel-gvt-dev@lists.freedesktop.org Suggested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Zhi Wang <zhi.a.wang@intel.com> Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20221110122034.3382-2-zhi.a.wang@intel.com
* | drm/i915/gvt: fix vgpu debugfs clean in removeZhenyu Wang2023-01-041-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Check carefully on root debugfs available when destroying vgpu, e.g in remove case drm minor's debugfs root might already be destroyed, which led to kernel oops like below. Console: switching to colour dummy device 80x25 i915 0000:00:02.0: MDEV: Unregistering intel_vgpu_mdev b1338b2d-a709-4c23-b766-cc436c36cdf0: Removing from iommu group 14 BUG: kernel NULL pointer dereference, address: 0000000000000150 PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP CPU: 3 PID: 1046 Comm: driverctl Not tainted 6.1.0-rc2+ #6 Hardware name: HP HP ProDesk 600 G3 MT/829D, BIOS P02 Ver. 02.44 09/13/2022 RIP: 0010:__lock_acquire+0x5e2/0x1f90 Code: 87 ad 09 00 00 39 05 e1 1e cc 02 0f 82 f1 09 00 00 ba 01 00 00 00 48 83 c4 48 89 d0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 45 31 ff <48> 81 3f 60 9e c2 b6 45 0f 45 f8 83 fe 01 0f 87 55 fa ff ff 89 f0 RSP: 0018:ffff9f770274f948 EFLAGS: 00010046 RAX: 0000000000000003 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000150 RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000 R10: ffff8895d1173300 R11: 0000000000000001 R12: 0000000000000000 R13: 0000000000000150 R14: 0000000000000000 R15: 0000000000000000 FS: 00007fc9b2ba0740(0000) GS:ffff889cdfcc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000150 CR3: 000000010fd93005 CR4: 00000000003706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> lock_acquire+0xbf/0x2b0 ? simple_recursive_removal+0xa5/0x2b0 ? lock_release+0x13d/0x2d0 down_write+0x2a/0xd0 ? simple_recursive_removal+0xa5/0x2b0 simple_recursive_removal+0xa5/0x2b0 ? start_creating.part.0+0x110/0x110 ? _raw_spin_unlock+0x29/0x40 debugfs_remove+0x40/0x60 intel_gvt_debugfs_remove_vgpu+0x15/0x30 [kvmgt] intel_gvt_destroy_vgpu+0x60/0x100 [kvmgt] intel_vgpu_release_dev+0xe/0x20 [kvmgt] device_release+0x30/0x80 kobject_put+0x79/0x1b0 device_release_driver_internal+0x1b8/0x230 bus_remove_device+0xec/0x160 device_del+0x189/0x400 ? up_write+0x9c/0x1b0 ? mdev_device_remove_common+0x60/0x60 [mdev] mdev_device_remove_common+0x22/0x60 [mdev] mdev_device_remove_cb+0x17/0x20 [mdev] device_for_each_child+0x56/0x80 mdev_unregister_parent+0x5a/0x81 [mdev] intel_gvt_clean_device+0x2d/0xe0 [kvmgt] intel_gvt_driver_remove+0x2e/0xb0 [i915] i915_driver_remove+0xac/0x100 [i915] i915_pci_remove+0x1a/0x30 [i915] pci_device_remove+0x31/0xa0 device_release_driver_internal+0x1b8/0x230 unbind_store+0xd8/0x100 kernfs_fop_write_iter+0x156/0x210 vfs_write+0x236/0x4a0 ksys_write+0x61/0xd0 do_syscall_64+0x55/0x80 ? find_held_lock+0x2b/0x80 ? lock_release+0x13d/0x2d0 ? up_read+0x17/0x20 ? lock_is_held_type+0xe3/0x140 ? asm_exc_page_fault+0x22/0x30 ? lockdep_hardirqs_on+0x7d/0x100 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7fc9b2c9e0c4 Code: 15 71 7d 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 80 3d 3d 05 0e 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 48 83 ec 28 48 89 54 24 18 48 RSP: 002b:00007ffec29c81c8 EFLAGS: 00000202 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007fc9b2c9e0c4 RDX: 000000000000000d RSI: 0000559f8b5f48a0 RDI: 0000000000000001 RBP: 0000559f8b5f48a0 R08: 0000559f8b5f3540 R09: 00007fc9b2d76d30 R10: 0000000000000000 R11: 0000000000000202 R12: 000000000000000d R13: 00007fc9b2d77780 R14: 000000000000000d R15: 00007fc9b2d72a00 </TASK> Modules linked in: sunrpc intel_rapl_msr intel_rapl_common intel_pmc_core_pltdrv intel_pmc_core intel_tcc_cooling x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel ee1004 igbvf rapl vfat fat intel_cstate intel_uncore pktcdvd i2c_i801 pcspkr wmi_bmof i2c_smbus acpi_pad vfio_pci vfio_pci_core vfio_virqfd zram fuse dm_multipath kvmgt mdev vfio_iommu_type1 vfio kvm irqbypass i915 nvme e1000e igb nvme_core crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic serio_raw ghash_clmulni_intel sha512_ssse3 dca drm_buddy intel_gtt video wmi drm_display_helper ttm CR2: 0000000000000150 ---[ end trace 0000000000000000 ]--- Cc: Wang Zhi <zhi.a.wang@intel.com> Cc: He Yu <yu.he@intel.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: stable@vger.kernel.org Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Tested-by: Yu He <yu.he@intel.com> Fixes: bc7b0be316ae ("drm/i915/gvt: Add basic debugfs infrastructure") Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20221219140357.769557-2-zhenyuw@linux.intel.com
* | drm/i915/gvt: fix gvt debugfs destroyZhenyu Wang2023-01-041-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When gvt debug fs is destroyed, need to have a sane check if drm minor's debugfs root is still available or not, otherwise in case like device remove through unbinding, drm minor's debugfs directory has already been removed, then intel_gvt_debugfs_clean() would act upon dangling pointer like below oops. i915 0000:00:02.0: Direct firmware load for i915/gvt/vid_0x8086_did_0x1926_rid_0x0a.golden_hw_state failed with error -2 i915 0000:00:02.0: MDEV: Registered Console: switching to colour dummy device 80x25 i915 0000:00:02.0: MDEV: Unregistering BUG: kernel NULL pointer dereference, address: 00000000000000a0 PGD 0 P4D 0 Oops: 0002 [#1] PREEMPT SMP PTI CPU: 2 PID: 2486 Comm: gfx-unbind.sh Tainted: G I 6.1.0-rc8+ #15 Hardware name: Dell Inc. XPS 13 9350/0JXC1H, BIOS 1.13.0 02/10/2020 RIP: 0010:down_write+0x1f/0x90 Code: 1d ff ff 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 53 48 89 fb e8 62 c0 ff ff bf 01 00 00 00 e8 28 5e 31 ff 31 c0 ba 01 00 00 00 <f0> 48 0f b1 13 75 33 65 48 8b 04 25 c0 bd 01 00 48 89 43 08 bf 01 RSP: 0018:ffff9eb3036ffcc8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 00000000000000a0 RCX: ffffff8100000000 RDX: 0000000000000001 RSI: 0000000000000064 RDI: ffffffffa48787a8 RBP: ffff9eb3036ffd30 R08: ffffeb1fc45a0608 R09: ffffeb1fc45a05c0 R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000000 R13: ffff91acc33fa328 R14: ffff91acc033f080 R15: ffff91acced533e0 FS: 00007f6947bba740(0000) GS:ffff91ae36d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000a0 CR3: 00000001133a2002 CR4: 00000000003706e0 Call Trace: <TASK> simple_recursive_removal+0x9f/0x2a0 ? start_creating.part.0+0x120/0x120 ? _raw_spin_lock+0x13/0x40 debugfs_remove+0x40/0x60 intel_gvt_debugfs_clean+0x15/0x30 [kvmgt] intel_gvt_clean_device+0x49/0xe0 [kvmgt] intel_gvt_driver_remove+0x2f/0xb0 i915_driver_remove+0xa4/0xf0 i915_pci_remove+0x1a/0x30 pci_device_remove+0x33/0xa0 device_release_driver_internal+0x1b2/0x230 unbind_store+0xe0/0x110 kernfs_fop_write_iter+0x11b/0x1f0 vfs_write+0x203/0x3d0 ksys_write+0x63/0xe0 do_syscall_64+0x37/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f6947cb5190 Code: 40 00 48 8b 15 71 9c 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 80 3d 51 24 0e 00 00 74 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 48 89 RSP: 002b:00007ffcbac45a28 EFLAGS: 00000202 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007f6947cb5190 RDX: 000000000000000d RSI: 0000555e35c866a0 RDI: 0000000000000001 RBP: 0000555e35c866a0 R08: 0000000000000002 R09: 0000555e358cb97c R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000001 R13: 000000000000000d R14: 0000000000000000 R15: 0000555e358cb8e0 </TASK> Modules linked in: kvmgt CR2: 00000000000000a0 ---[ end trace 0000000000000000 ]--- Cc: Wang, Zhi <zhi.a.wang@intel.com> Cc: He, Yu <yu.he@intel.com> Cc: stable@vger.kernel.org Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Fixes: bc7b0be316ae ("drm/i915/gvt: Add basic debugfs infrastructure") Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20221219140357.769557-1-zhenyuw@linux.intel.com
* | drm/i915: unpin on error in intel_vgpu_shadow_mm_pin()Dan Carpenter2023-01-041-0/+1
|/ | | | | | | | | | Call intel_vgpu_unpin_mm() on this error path. Fixes: 418741480809 ("drm/i915/gvt: Adding ppgtt to GVT GEM context after shadow pdps settled.") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/Y3OQ5tgZIVxyQ/WV@kili Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
* drm/i915/dsi: fix MIPI_BKLT_EN_1 native GPIO indexJani Nikula2022-12-301-1/+1
| | | | | | | | | | | | | | Due to copy-paste fail, MIPI_BKLT_EN_1 would always use PPS index 1, never 0. Fix the sloppiest commit in recent memory. Fixes: 963bbdb32b47 ("drm/i915/dsi: add support for ICL+ native MIPI GPIO sequence") Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221220140105.313333-1-jani.nikula@intel.com (cherry picked from commit a561933c571798868b5fa42198427a7e6df56c09) Cc: stable@vger.kernel.org # 6.1 Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
* drm/i915/dsi: add support for ICL+ native MIPI GPIO sequenceJani Nikula2022-12-303-3/+95
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Starting from ICL, the default for MIPI GPIO sequences seems to be using native GPIOs i.e. GPIOs available in the GPU. These native GPIOs reuse many pins that quite frankly seem scary to poke based on the VBT sequences. We pretty much have to trust that the board is configured such that the relevant HPD, PP_CONTROL and GPIO bits aren't used for anything else. MIPI sequence v4 also adds a flag to fall back to non-native sequences. v5: - Wrap SHOTPLUG_CTL_DDI modification in spin_lock() in icp_irq_handler() too (Ville) - References instead of Closes issue 6131 because this does not fix everything v4: - Wrap SHOTPLUG_CTL_DDI modification in spin_lock_irq() (Ville) v3: - Fix -Wbitwise-conditional-parentheses (kernel test robot <lkp@intel.com>) v2: - Fix HPD pin output set (impacts GPIOs 0 and 5) - Fix GPIO data output direction set (impacts GPIOs 4 and 9) - Reduce register accesses to single intel_de_rwm() References: https://gitlab.freedesktop.org/drm/intel/-/issues/6131 Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221219105955.4014451-1-jani.nikula@intel.com (cherry picked from commit f087cfe6fcff58044f7aa3b284965af47f472fb0) Cc: stable@vger.kernel.org # 6.1 Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
* drm/i915/uc: Fix two issues with over-size firmware filesJohn Harrison2022-12-301-14/+28
| | | | | | | | | | | | | | | | | | | | | | | | In the case where a firmware file is too large (e.g. someone downloaded a web page ASCII dump from github...), the firmware object is released but the pointer is not zerod. If no other firmware file was found then release would be called again leading to a double kfree. Also, the size check was only being applied to the initial firmware load not any of the subsequent attempts. So move the check into a wrapper that is used for all loads. Fixes: 016241168dc5 ("drm/i915/uc: use different ggtt pin offsets for uc loads") Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221221193031.687266-4-John.C.Harrison@Intel.com (cherry picked from commit 4071d98b296a5bc5fd4b15ec651bd05800ec9510) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
* drm/i915: improve the catch-all evict to handle lock contentionMatthew Auld2022-12-306-26/+82
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The catch-all evict can fail due to object lock contention, since it only goes as far as trylocking the object, due to us already holding the vm->mutex. Doing a full object lock here can deadlock, since the vm->mutex is always our inner lock. Add another execbuf pass which drops the vm->mutex and then tries to grab the object will the full lock, before then retrying the eviction. This should be good enough for now to fix the immediate regression with userspace seeing -ENOSPC from execbuf due to contended object locks during GTT eviction. v2 (Mani) - Also revamp the docs for the different passes. Testcase: igt@gem_ppgtt@shrink-vs-evict-* Fixes: 7e00897be8bf ("drm/i915: Add object locking to i915_gem_evict_for_node and i915_gem_evict_something, v2.") References: https://gitlab.freedesktop.org/drm/intel/-/issues/7627 References: https://gitlab.freedesktop.org/drm/intel/-/issues/7570 References: https://bugzilla.mozilla.org/show_bug.cgi?id=1779558 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Andrzej Hajda <andrzej.hajda@intel.com> Cc: Mani Milani <mani@chromium.org> Cc: <stable@vger.kernel.org> # v5.18+ Reviewed-by: Mani Milani <mani@chromium.org> Tested-by: Mani Milani <mani@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20221216113456.414183-1-matthew.auld@intel.com (cherry picked from commit 801fa7a81f6da533cc5442fc40e32c72b76cd42a) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
* drm/i915: Remove __maybe_unused from mtl_infoLucas De Marchi2022-12-301-1/+0
| | | | | | | | | | | | | | | The attribute __maybe_unused should remain only until the respective info is not in the pciidlist. The info can't be added together with its definition because that would cause the driver to automatically probe for the device, while it's still not ready for that. However once pciidlist contains it, the attribute can be removed. Fixes: 7835303982d1 ("drm/i915/mtl: Add MeteorLake PCI IDs") Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221214194944.3670344-1-lucas.demarchi@intel.com (cherry picked from commit 50490ce05b7a50b0bd4108fa7d6db3ca2972fa83) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
* drm/i915: fix TLB invalidation for Gen12.50 video and compute enginesAndrzej Hajda2022-12-301-1/+7
| | | | | | | | | | | | | | | In case of Gen12.50 video and compute engines, TLB_INV registers are masked - to modify one bit, corresponding bit in upper half of the register must be enabled, otherwise nothing happens. Fixes: 77fa9efc16a9 ("drm/i915/xehp: Create separate reg definitions for new MCR registers") Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221214075439.402485-1-andrzej.hajda@intel.com (cherry picked from commit 4d5cf7b1680a1e6db327e3c935ef58325cbedb2c) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
* treewide: Convert del_timer*() to timer_shutdown*()Steven Rostedt (Google)2022-12-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Due to several bugs caused by timers being re-armed after they are shutdown and just before they are freed, a new state of timers was added called "shutdown". After a timer is set to this state, then it can no longer be re-armed. The following script was run to find all the trivial locations where del_timer() or del_timer_sync() is called in the same function that the object holding the timer is freed. It also ignores any locations where the timer->function is modified between the del_timer*() and the free(), as that is not considered a "trivial" case. This was created by using a coccinelle script and the following commands: $ cat timer.cocci @@ expression ptr, slab; identifier timer, rfield; @@ ( - del_timer(&ptr->timer); + timer_shutdown(&ptr->timer); | - del_timer_sync(&ptr->timer); + timer_shutdown_sync(&ptr->timer); ) ... when strict when != ptr->timer ( kfree_rcu(ptr, rfield); | kmem_cache_free(slab, ptr); | kfree(ptr); ) $ spatch timer.cocci . > /tmp/t.patch $ patch -p1 < /tmp/t.patch Link: https://lore.kernel.org/lkml/20221123201306.823305113@linutronix.de/ Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Acked-by: Pavel Machek <pavel@ucw.cz> [ LED ] Acked-by: Kalle Valo <kvalo@kernel.org> [ wireless ] Acked-by: Paolo Abeni <pabeni@redhat.com> [ networking ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge tag 'drm-next-2022-12-23' of git://anongit.freedesktop.org/drm/drmLinus Torvalds2022-12-2310-88/+96
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull drm fixes from Dave Airlie: "Holiday fixes! Two batches from amd, and one group of i915 changes. amdgpu: - Spelling fix - BO pin fix - Properly handle polaris 10/11 overlap asics - GMC9 fix - SR-IOV suspend fix - DCN 3.1.4 fix - KFD userptr locking fix - SMU13.x fixes - GDS/GWS/OA handling fix - Reserved VMID handling fixes - FRU EEPROM fix - BO validation fixes - Avoid large variable on the stack - S0ix fixes - SMU 13.x fixes - VCN fix - Add missing fence reference amdkfd: - Fix init vm error handling - Fix double release of compute pasid i915 - Documentation fixes - OA-perf related fix - VLV/CHV HDMI/DP audio fix - Display DDI/Transcoder fix - Migrate fixes" * tag 'drm-next-2022-12-23' of git://anongit.freedesktop.org/drm/drm: (39 commits) drm/amdgpu: grab extra fence reference for drm_sched_job_add_dependency drm/amdgpu: enable VCN DPG for GC IP v11.0.4 drm/amdgpu: skip mes self test after s0i3 resume for MES IP v11.0 drm/amd/pm: correct the fan speed retrieving in PWM for some SMU13 asics drm/amd/pm: bump SMU13.0.0 driver_if header to version 0x34 drm/amdgpu: skip MES for S0ix as well since it's part of GFX drm/amd/pm: avoid large variable on kernel stack drm/amdkfd: Fix double release compute pasid drm/amdkfd: Fix kfd_process_device_init_vm error handling drm/amd/pm: update SMU13.0.0 reported maximum shader clock drm/amd/pm: correct SMU13.0.0 pstate profiling clock settings drm/amd/pm: enable GPO dynamic control support for SMU13.0.7 drm/amd/pm: enable GPO dynamic control support for SMU13.0.0 drm/amdgpu: revert "generally allow over-commit during BO allocation" drm/amdgpu: Remove unnecessary domain argument drm/amdgpu: Fix size validation for non-exclusive domains (v4) drm/amdgpu: Check if fru_addr is not NULL (v2) drm/i915/ttm: consider CCS for backup objects drm/i915/migrate: fix corner case in CCS aux copying drm/amdgpu: rework reserved VMID handling ...
| * drm/i915/ttm: consider CCS for backup objectsMatthew Auld2022-12-143-5/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It seems we can have one or more framebuffers that are still pinned when suspending lmem, in such a case we end up creating a shmem backup object, instead of evicting the object directly, but this will skip copying the CCS aux state, since we don't allocate the extra storage for the CCS pages as part of the ttm_tt construction. Since we can already deal with pinned objects just fine, it doesn't seem too nasty to just extend to support dealing with the CCS aux state, if the object is a pinned framebuffer. This fixes display corruption (like in gnome-shell) seen on DG2 when returning from suspend. Fixes: da0595ae91da ("drm/i915/migrate: Evict and restore the flatccs capable lmem obj") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Cc: Andrzej Hajda <andrzej.hajda@intel.com> Cc: Shuicheng Lin <shuicheng.lin@intel.com> Cc: <stable@vger.kernel.org> # v5.19+ Tested-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221212171958.82593-2-matthew.auld@intel.com (cherry picked from commit 95df9cc24bee8a09d39c62bcef4319b984814e18) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
| * drm/i915/migrate: fix corner case in CCS aux copyingMatthew Auld2022-12-141-8/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the case of lmem -> lmem transfers, which is currently only possible with small-bar systems, we need to ensure we copy the CCS aux state as-is, rather than nuke it. This should fix some nasty display corruption sometimes seen on DG2 small-bar systems, when also using DG2_RC_CCS_CC for the surface. Fixes: e3afc690188b ("drm/i915/display: consider DG2_RC_CCS_CC when migrating buffers") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Cc: Andrzej Hajda <andrzej.hajda@intel.com> Cc: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221212171958.82593-1-matthew.auld@intel.com (cherry picked from commit b29d26fbcb862526d5047caec82878be2eb75c0f) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
| * drm/i915/migrate: Account for the reserved_spaceChris Wilson2022-12-131-4/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the ring is nearly full when calling into emit_pte(), we might incorrectly trample the reserved_space when constructing the packet to emit the PTEs. This then triggers the GEM_BUG_ON(rq->reserved_space > ring->space) when later submitting the request, since the request itself doesn't have enough space left in the ring to emit things like workarounds, breadcrumbs etc. v2: Fix the whitespace errors Testcase: igt@i915_selftests@live_emit_pte_full_ring Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/7535 Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6889 Fixes: cf586021642d ("drm/i915/gt: Pipelined page migration") Signed-off-by: Chris Wilson <chris.p.wilson@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Andrzej Hajda <andrzej.hajda@intel.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Cc: <stable@vger.kernel.org> # v5.15+ Tested-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221202122844.428006-1-matthew.auld@intel.com (cherry picked from commit 35168a6c4ed53db4f786858bac23b1474fd7d0dc) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
| * drm/i915/display: Don't disable DDI/Transcoder when setting phy test patternKhaled Almahallawy2022-12-131-59/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Bspecs has updated recently to remove the restriction to disable DDI/Transcoder before setting PHY test pattern. This update is to address PHY compliance test failures observed on a port with LTTPR. The issue is that when Transc. is disabled, the main link signals fed to LTTPR will be dropped invalidating link training, which will affect the quality of the phy test pattern when the transcoder is enabled again. v2: Update commit message (Clint) v3: Add missing Signed-off in v2 v4: Update Bspec and commit message for pre-gen12 (Jani) Bspec: 50482, 7555 Fixes: 8cdf72711928 ("drm/i915/dp: Program vswing, pre-emphasis, test-pattern") Cc: Imre Deak <imre.deak@intel.com> Cc: Clint Taylor <clinton.a.taylor@intel.com> CC: Jani Nikula <jani.nikula@intel.com> Tested-by: Khaled Almahallawy <khaled.almahallawy@intel.com> Reviewed-by: Clint Taylor <clinton.a.taylor@intel.com> Signed-off-by: Khaled Almahallawy <khaled.almahallawy@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221123220926.170034-1-khaled.almahallawy@intel.com (cherry picked from commit be4a847652056b067d6dc6fe0fc024a9e2e987ca) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
| * drm/i915: Fix VLV/CHV HDMI/DP audio enableVille Syrjälä2022-12-132-8/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Despite what I claimed in commit c3c5dc1d9224 ("drm/i915/audio: Do the vblank waits") the vblank interrupts are in fact not enabled yet when we do the audio enable sequence on VLV/CHV (all other platforms are fine). Reorder the enable sequence on VLV/CHV to match that of the other platforms so that the audio enable happens after the pipe has been enabled. Fixes: c3c5dc1d9224 ("drm/i915/audio: Do the vblank waits") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221207225219.29060-1-ville.syrjala@linux.intel.com Reviewed-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit a467a243554a64b418c14d7531a3b18c03d53bff) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
| * drm/i915/perf: Do not parse context image for HSWUmesh Nerlige Ramappa2022-12-091-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | An earlier commit introduced a mechanism to parse the context image to find the OA context control offset. This resulted in an NPD on haswell when gem_context was passed into i915_perf_open_ioctl params. Haswell does not support logical ring contexts, so ensure that the context image is parsed only for platforms with logical ring contexts and also validate lrc_reg_state. v2: Fix build failure v3: Fix checkpatch error Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/7432 Fixes: a5c3a3cbf029 ("drm/i915/perf: Determine gen12 oa ctx offset at runtime") Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221123235342.713068-1-umesh.nerlige.ramappa@intel.com (cherry picked from commit 95c713d722017b26e301303713d638e0b95b1f68) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
| * drm/i915: Fix documentation for intel_uncore_forcewake_put__lockedMiaoqian Lin2022-12-091-2/+2
| | | | | | | | | | | | | | | | | | | | | | intel_uncore_forcewake_put__locked() is used to release a reference. Fixes: a6111f7b6604 ("drm/i915: Reduce locking in execlist command submission") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221207112909.2655251-1-linmq006@gmail.com (cherry picked from commit 955f4d7176eb154db587ae162ec2b392dc8d5f27) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
| * drm/i915/gt: Correct kerneldoc for intel_gt_mcr_wait_for_reg()Matt Roper2022-12-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | The kerneldoc function name was not updated when this function was converted to a non-fw form. Fixes: 41f425adbce9 ("drm/i915/gt: Manage uncore->lock while waiting on MCR register") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221128233014.4000136-2-matthew.d.roper@intel.com (cherry picked from commit 03b713d029bd17a1ed426590609af79843db95e2) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
* | Merge tag 'vfio-v6.2-rc1' of https://github.com/awilliam/linux-vfioLinus Torvalds2022-12-151-1/+0
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull VFIO updates from Alex Williamson: - Replace deprecated git://github.com link in MAINTAINERS (Palmer Dabbelt) - Simplify vfio/mlx5 with module_pci_driver() helper (Shang XiaoJing) - Drop unnecessary buffer from ACPI call (Rafael Mendonca) - Correct latent missing include issue in iova-bitmap and fix support for unaligned bitmaps. Follow-up with better fix through refactor (Joao Martins) - Rework ccw mdev driver to split private data from parent structure, better aligning with the mdev lifecycle and allowing us to remove a temporary workaround (Eric Farman) - Add an interface to get an estimated migration data size for a device, allowing userspace to make informed decisions, ex. more accurately predicting VM downtime (Yishai Hadas) - Fix minor typo in vfio/mlx5 array declaration (Yishai Hadas) - Simplify module and Kconfig through consolidating SPAPR/EEH code and config options and folding virqfd module into main vfio module (Jason Gunthorpe) - Fix error path from device_register() across all vfio mdev and sample drivers (Alex Williamson) - Define migration pre-copy interface and implement for vfio/mlx5 devices, allowing portions of the device state to be saved while the device continues operation, towards reducing the stop-copy state size (Jason Gunthorpe, Yishai Hadas, Shay Drory) - Implement pre-copy for hisi_acc devices (Shameer Kolothum) - Fixes to mdpy mdev driver remove path and error path on probe (Shang XiaoJing) - vfio/mlx5 fixes for incorrect return after copy_to_user() fault and incorrect buffer freeing (Dan Carpenter) * tag 'vfio-v6.2-rc1' of https://github.com/awilliam/linux-vfio: (42 commits) vfio/mlx5: error pointer dereference in error handling vfio/mlx5: fix error code in mlx5vf_precopy_ioctl() samples: vfio-mdev: Fix missing pci_disable_device() in mdpy_fb_probe() hisi_acc_vfio_pci: Enable PRE_COPY flag hisi_acc_vfio_pci: Move the dev compatibility tests for early check hisi_acc_vfio_pci: Introduce support for PRE_COPY state transitions hisi_acc_vfio_pci: Add support for precopy IOCTL vfio/mlx5: Enable MIGRATION_PRE_COPY flag vfio/mlx5: Fallback to STOP_COPY upon specific PRE_COPY error vfio/mlx5: Introduce multiple loads vfio/mlx5: Consider temporary end of stream as part of PRE_COPY vfio/mlx5: Introduce vfio precopy ioctl implementation vfio/mlx5: Introduce SW headers for migration states vfio/mlx5: Introduce device transitions of PRE_COPY vfio/mlx5: Refactor to use queue based data chunks vfio/mlx5: Refactor migration file state vfio/mlx5: Refactor MKEY usage vfio/mlx5: Refactor PD usage vfio/mlx5: Enforce a single SAVE command at a time vfio: Extend the device migration protocol with PRE_COPY ...
| * | vfio: Remove vfio_free_deviceEric Farman2022-11-101-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the "mess" sorted out, we should be able to inline the vfio_free_device call introduced by commit cb9ff3f3b84c ("vfio: Add helpers for unifying vfio_device life cycle") and remove them from driver release callbacks. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> # vfio-ap part Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20221104142007.1314999-8-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
* | | Merge tag 'hardening-v6.2-rc1' of ↵Linus Torvalds2022-12-142-5/+1
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull kernel hardening updates from Kees Cook: - Convert flexible array members, fix -Wstringop-overflow warnings, and fix KCFI function type mismatches that went ignored by maintainers (Gustavo A. R. Silva, Nathan Chancellor, Kees Cook) - Remove the remaining side-effect users of ksize() by converting dma-buf, btrfs, and coredump to using kmalloc_size_roundup(), add more __alloc_size attributes, and introduce full testing of all allocator functions. Finally remove the ksize() side-effect so that each allocation-aware checker can finally behave without exceptions - Introduce oops_limit (default 10,000) and warn_limit (default off) to provide greater granularity of control for panic_on_oops and panic_on_warn (Jann Horn, Kees Cook) - Introduce overflows_type() and castable_to_type() helpers for cleaner overflow checking - Improve code generation for strscpy() and update str*() kern-doc - Convert strscpy and sigphash tests to KUnit, and expand memcpy tests - Always use a non-NULL argument for prepare_kernel_cred() - Disable structleak plugin in FORTIFY KUnit test (Anders Roxell) - Adjust orphan linker section checking to respect CONFIG_WERROR (Xin Li) - Make sure siginfo is cleared for forced SIGKILL (haifeng.xu) - Fix um vs FORTIFY warnings for always-NULL arguments * tag 'hardening-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (31 commits) ksmbd: replace one-element arrays with flexible-array members hpet: Replace one-element array with flexible-array member um: virt-pci: Avoid GCC non-NULL warning signal: Initialize the info in ksignal lib: fortify_kunit: build without structleak plugin panic: Expose "warn_count" to sysfs panic: Introduce warn_limit panic: Consolidate open-coded panic_on_warn checks exit: Allow oops_limit to be disabled exit: Expose "oops_count" to sysfs exit: Put an upper limit on how often we can oops panic: Separate sysctl logic from CONFIG_SMP mm/pgtable: Fix multiple -Wstringop-overflow warnings mm: Make ksize() a reporting-only function kunit/fortify: Validate __alloc_size attribute results drm/sti: Fix return type of sti_{dvo,hda,hdmi}_connector_mode_valid() drm/fsl-dcu: Fix return type of fsl_dcu_drm_connector_mode_valid() driver core: Add __alloc_size hint to devm allocators overflow: Introduce overflows_type() and castable_to_type() coredump: Proactively round up to kmalloc bucket size ...
| * | | overflow: Introduce overflows_type() and castable_to_type()Kees Cook2022-11-022-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement a robust overflows_type() macro to test if a variable or constant value would overflow another variable or type. This can be used as a constant expression for static_assert() (which requires a constant expression[1][2]) when used on constant values. This must be constructed manually, since __builtin_add_overflow() does not produce a constant expression[3]. Additionally adds castable_to_type(), similar to __same_type(), but for checking if a constant value would overflow if cast to a given type. Add unit tests for overflows_type(), __same_type(), and castable_to_type() to the existing KUnit "overflow" test: [16:03:33] ================== overflow (21 subtests) ================== ... [16:03:33] [PASSED] overflows_type_test [16:03:33] [PASSED] same_type_test [16:03:33] [PASSED] castable_to_type_test [16:03:33] ==================== [PASSED] overflow ===================== [16:03:33] ============================================================ [16:03:33] Testing complete. Ran 21 tests: passed: 21 [16:03:33] Elapsed time: 24.022s total, 0.002s configuring, 22.598s building, 0.767s running [1] https://en.cppreference.com/w/c/language/_Static_assert [2] C11 standard (ISO/IEC 9899:2011): 6.7.10 Static assertions [3] https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins.html 6.56 Built-in Functions to Perform Arithmetic with Overflow Checking Built-in Function: bool __builtin_add_overflow (type1 a, type2 b, Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Tom Rix <trix@redhat.com> Cc: Daniel Latypov <dlatypov@google.com> Cc: Vitor Massaru Iha <vitor@massaru.org> Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: linux-hardening@vger.kernel.org Cc: llvm@lists.linux.dev Co-developed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20221024201125.1416422-1-gwan-gyeong.mun@intel.com
* | | | Merge tag 'for-linus-iommufd' of ↵Linus Torvalds2022-12-141-4/+17
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd Pull iommufd implementation from Jason Gunthorpe: "iommufd is the user API to control the IOMMU subsystem as it relates to managing IO page tables that point at user space memory. It takes over from drivers/vfio/vfio_iommu_type1.c (aka the VFIO container) which is the VFIO specific interface for a similar idea. We see a broad need for extended features, some being highly IOMMU device specific: - Binding iommu_domain's to PASID/SSID - Userspace IO page tables, for ARM, x86 and S390 - Kernel bypassed invalidation of user page tables - Re-use of the KVM page table in the IOMMU - Dirty page tracking in the IOMMU - Runtime Increase/Decrease of IOPTE size - PRI support with faults resolved in userspace Many of these HW features exist to support VM use cases - for instance the combination of PASID, PRI and Userspace IO Page Tables allows an implementation of DMA Shared Virtual Addressing (vSVA) within a guest. Dirty tracking enables VM live migration with SRIOV devices and PASID support allow creating "scalable IOV" devices, among other things. As these features are fundamental to a VM platform they need to be uniformly exposed to all the driver families that do DMA into VMs, which is currently VFIO and VDPA" For more background, see the extended explanations in Jason's pull request: https://lore.kernel.org/lkml/Y5dzTU8dlmXTbzoJ@nvidia.com/ * tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: (62 commits) iommufd: Change the order of MSI setup iommufd: Improve a few unclear bits of code iommufd: Fix comment typos vfio: Move vfio group specific code into group.c vfio: Refactor dma APIs for emulated devices vfio: Wrap vfio group module init/clean code into helpers vfio: Refactor vfio_device open and close vfio: Make vfio_device_open() truly device specific vfio: Swap order of vfio_device_container_register() and open_device() vfio: Set device->group in helper function vfio: Create wrappers for group register/unregister vfio: Move the sanity check of the group to vfio_create_group() vfio: Simplify vfio_create_group() iommufd: Allow iommufd to supply /dev/vfio/vfio vfio: Make vfio_container optionally compiled vfio: Move container related MODULE_ALIAS statements into container.c vfio-iommufd: Support iommufd for emulated VFIO devices vfio-iommufd: Support iommufd for physical VFIO devices vfio-iommufd: Allow iommufd to be used in place of a container fd vfio: Use IOMMU_CAP_ENFORCE_CACHE_COHERENCY for vfio_file_enforced_coherent() ...
| * \ \ \ Merge tag 'v6.1-rc7' into iommufd.git for-nextJason Gunthorpe2022-12-0223-158/+302
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Resolve conflicts in drivers/vfio/vfio_main.c by using the iommfd version. The rc fix was done a different way when iommufd patches reworked this code. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | | | vfio-iommufd: Support iommufd for emulated VFIO devicesJason Gunthorpe2022-12-021-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Emulated VFIO devices are calling vfio_register_emulated_iommu_dev() and consist of all the mdev drivers. Like the physical drivers, support for iommufd is provided by the driver supplying the correct standard ops. Provide ops from the core that duplicate what vfio_register_emulated_iommu_dev() does. Emulated drivers are where it is more likely to see variation in the iommfd support ops. For instance IDXD will probably need to setup both a iommfd_device context linked to a PASID and an iommufd_access context to support all their mdev operations. Link: https://lore.kernel.org/r/7-v4-42cd2eb0e3eb+335a-vfio_iommufd_jgg@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Tested-by: Alex Williamson <alex.williamson@redhat.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Yi Liu <yi.l.liu@intel.com> Tested-by: Lixiao Yang <lixiao.yang@intel.com> Tested-by: Matthew Rosato <mjrosato@linux.ibm.com> Tested-by: Yu He <yu.he@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>