| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Building drivers/gpu/drm/xe/xe_gt_pagefault.c with GCC 11 results
in the following build errors:
./include/linux/fortify-string.h:57:33: error: writing 16 bytes into a region of size 0 [-Werror=stringop-overflow=]
57 | #define __underlying_memcpy __builtin_memcpy
| ^
./include/linux/fortify-string.h:644:9: note: in expansion of macro ‘__underlying_memcpy’
644 | __underlying_##op(p, q, __fortify_size); \
| ^~~~~~~~~~~~~
./include/linux/fortify-string.h:689:26: note: in expansion of macro ‘__fortify_memcpy_chk’
689 | #define memcpy(p, q, s) __fortify_memcpy_chk(p, q, s, \
| ^~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/xe/xe_gt_pagefault.c:340:17: note: in expansion of macro ‘memcpy’
340 | memcpy(pf_queue->data + pf_queue->tail, msg, len * sizeof(u32));
| ^~~~~~
In file included from drivers/gpu/drm/xe/xe_device_types.h:17,
from drivers/gpu/drm/xe/xe_vm_types.h:16,
from drivers/gpu/drm/xe/xe_bo.h:13,
from drivers/gpu/drm/xe/xe_gt_pagefault.c:16:
drivers/gpu/drm/xe/xe_gt_types.h:102:25: note: at offset [1144, 265324] into destination object ‘tile’ of size 8
102 | struct xe_tile *tile;
| ^~~~
Fix these by removing -Wstringop-overflow from drm/xe builds.
Closes: https://lore.kernel.org/all/45ad1d0f-a10f-483e-848a-76a30252edbe@paulmck-laptop/
Fixes: 7a8bc11782d3 ("drm/xe: Enable W=1 warnings by default")
Suggested-by: Stephen Rothwell <sfr@rothwell.id.au>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
[ This particular warning is broken on GCC11. In future changes it will
be moved to the normal C flags in the top level Makefile (out of
Makefile.extrawarn), but accounting for the compiler support. Just
remove it out of xe's forced extra warnings for now ]
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit a109d19992294736abd4f4232ea639e03eb1f9e7)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Don't set SLPC GuC feature ctl flag if skip_guc_pc is true.
v2: Skip the freq related sysfs creation as well (Badal)
v3: Remove unnecessary parenthesis (Lucas)
Fixes: 975e4a3795d4 ("drm/xe: Manually setup C6 when skip_guc_pc is set")
Fixes: bef52b5c7a19 ("drm/xe: Create a xe_gt_freq component for raw management and sysfs")
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://lore.kernel.org/r/20240108225842.966066-1-vinay.belgaumkar@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 69cac0a8f3ef8db4d62441c4a2686ec676c9facd)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After exec_queue has been created, we cannot simply modify q->priority.
This needs to be done by the backend via q->ops. However in this case,
it would be more efficient to simply pass a flag when creating the
exec_queue and set the desired priority upfront during queue creation.
To that end: new flag EXEC_QUEUE_FLAG_HIGH_PRIORITY is introduced.
The priority field is moved to be with other scheduling properties and
is now exec_queue.sched_props.priority. This is no longer set to initial
value by the backend, but is now set within __xe_exec_queue_create().
Fixes: b4eecedc75c1 ("drm/xe: Fix potential deadlock handling page faults")
Signed-off-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
(cherry picked from commit a8004af338f6b3319476ecbed63ea49bf393fc1f)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We need to set q->priority prior to calling guc_exec_queue_add_msg() as
that will call init_policies() and sets the scheduling properties to those
stored in the exec_queue.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
(cherry picked from commit b16483f9f8120b530327879fa3ea576e897946da)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The pointer points to IO memory, but the __iomem annotation was
incorrectly placed. Annotate it correctly, update its usage accordingly
and fix the corresponding sparse error.
Fixes: d8b52a02cb40 ("drm/xe: Implement stolen memory.")
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240109112405.108136-5-thomas.hellstrom@linux.intel.com
(cherry picked from commit dcddb6f0b06d454c9a3b2b240a43f0e7310c7f7c)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are a couple of pointers pointing to MMIO space. Annotate them
with __iomem and fix the corresponding sparse warnings.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Fixes: 3b0d4a557996 ("drm/xe: Move register MMIO into xe_tile")
Fixes: 399a13323f0d ("drm/xe: add 28-bit address support in struct xe_reg")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Koby Elbaz <kelbaz@habana.ai>
Cc: Ofir Bitton <obitton@habana.ai>
Cc: Moti Haimovski <mhaimovski@habana.ai>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240109112405.108136-4-thomas.hellstrom@linux.intel.com
(cherry picked from commit 9d612ee52c6096bc70d43f54921ba2831ffbf1ad)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The pointer points to IO memory, but the __iomem annotation was
incorrectly placed. Annotate it correctly, update its usage accordingly
and fix the corresponding sparse error.
Fixes: 0887a2e7ab62 ("drm/xe: Make xe_mem_region struct")
Cc: Oak Zeng <oak.zeng@intel.com>
Cc: Michael J. Ruhl <michael.j.ruhl@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240109112405.108136-3-thomas.hellstrom@linux.intel.com
(cherry picked from commit 20855b62a30538361e587cfc7c5245f07d4f826a)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The regs pointer points to IO memory. Annotate it properly and
fix the corresponding sparse warning.
Fixes: a4e2f3a299ea ("drm/xe: refactor xe_mmio_probe_tiles to support MMIO extension")
Cc: Koby Elbaz <kelbaz@habana.ai>
Cc: Ofir Bitton <obitton@habana.ai>
Cc: Moti Haimovski <mhaimovski@habana.ai>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240109112405.108136-2-thomas.hellstrom@linux.intel.com
(cherry picked from commit 9d03bf30e78673d827484bbc17a6fd8f5e43a039)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If using the VM_BIND_OP_UNMAP_ALL without any bound vmas for the
vm, we will end up dereferencing an uninitialized variable and leak a
bo lock. Fix this.
v2:
- Updated commit message (Lucas De Marchi)
Reported-by: Dafna Hirschfeld <dhirschfeld@habana.ai>
Closes: https://lore.kernel.org/intel-xe/jrwua7ckbiozfcaodx4gg2h4taiuxs53j5zlpf3qzvyhyiyl2d@pbs3plurokrj/
Suggested-by: Dafna Hirschfeld <dhirschfeld@habana.ai>
Fixes: b06d47be7c83 ("drm/xe: Port Xe to GPUVA")
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231222175904.16732-1-thomas.hellstrom@linux.intel.com
(cherry picked from commit 9d0c1c5618be02c5acda7e6bbb728007b0632984)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The intent is to return -EWOULDBLOCK to the user if a long running exec
queue is full during the exec IOCTL. -EWOULDBLOCK aliases to -EAGAIN
which results in the exec IOCTL doing a retry loop. Fix this by ensuring
the retry loop is broken when returning -EWOULDBLOCK.
Fixes: 8ae8a2e8dd21 ("drm/xe: Long running job update")
Reported-by: Sai Gowtham Ch <sai.gowtham.ch@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Brian Welty <brian.welty@intel.com>
(cherry picked from commit 97d0047cbb17318431eaf37dfe1a6855539340f9)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
i915 defines it as unsigned long so Xe should do the same to avoid
compilation warnings:
CC [M] drivers/gpu/drm/i915/i915_gem.o
CC [M] drivers/gpu/drm/xe/i915-display/intel_display_power_well.o
In file included from ./include/drm/drm_mm.h:51,
from drivers/gpu/drm/xe/xe_bo_types.h:11,
from drivers/gpu/drm/xe/xe_bo.h:11,
from ./drivers/gpu/drm/xe/compat-i915-headers/gem/i915_gem_object.h:11,
from ./drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h:15,
from drivers/gpu/drm/i915/display/intel_display_power.c:8:
drivers/gpu/drm/i915/display/intel_display_power.c: In function ‘print_async_put_domains_state’:
drivers/gpu/drm/i915/display/intel_display_power.c:408:29: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 5 has type ‘int’ [-Wformat=]
408 | drm_dbg(&i915->drm, "async_put_wakeref %lu\n",
| ^~~~~~~~~~~~~~~~~~~~~~~~~
409 | power_domains->async_put_wakeref);
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| |
| int
./include/drm/drm_print.h:410:39: note: in definition of macro ‘drm_dev_dbg’
410 | __drm_dev_dbg(NULL, dev, cat, fmt, ##__VA_ARGS__)
| ^~~
./include/drm/drm_print.h:510:33: note: in expansion of macro ‘drm_dbg_driver’
510 | #define drm_dbg(drm, fmt, ...) drm_dbg_driver(drm, fmt, ##__VA_ARGS__)
| ^~~~~~~~~~~~~~
drivers/gpu/drm/i915/display/intel_display_power.c:408:9: note: in expansion of macro ‘drm_dbg’
408 | drm_dbg(&i915->drm, "async_put_wakeref %lu\n",
| ^~~~~~~
drivers/gpu/drm/i915/display/intel_display_power.c:408:50: note: format string is defined here
408 | drm_dbg(&i915->drm, "async_put_wakeref %lu\n",
| ~~^
| |
| long unsigned int
| %u
CC [M] drivers/gpu/drm/i915/i915_gem_evict.o
CC [M] drivers/gpu/drm/i915/i915_gem_gtt.o
CC [M] drivers/gpu/drm/xe/i915-display/intel_display_trace.o
CC [M] drivers/gpu/drm/xe/i915-display/intel_display_wa.o
CC [M] drivers/gpu/drm/i915/i915_query.o
Fixes: 44e694958b95 ("drm/xe/display: Implement display support")
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
(cherry picked from commit fdbadf504375886a0320ac6f84c850322a6b32e1)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Having a different value for op is not possible: this is already kept
out of user-visible warning by the check in xe_wait_user_fence_ioctl()
if op > MAX_OP. The warning is useful as if this switch() is not update
when a new op is added, it should be triggered.
Fix warning as reported by 0-DAY CI Kernel:
drivers/gpu/drm/xe/xe_wait_user_fence.c:46:2: warning: variable 'passed' is used uninitialized whenever switch default is taken [-Wsometimes-uninitialized]
Closes: https://lore.kernel.org/oe-kbuild-all/202312170357.KPSinwPs-lkp@intel.com/
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20231218163301.3453285-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
add_preempt_fences() calls dma_resv_reserve_fences() with num_fences ==
0 resulting in the below UBSAN splat. Short circuit add_preempt_fences()
if num_fences == 0.
[ 58.652241] ================================================================================
[ 58.660736] UBSAN: shift-out-of-bounds in ./include/linux/log2.h:57:13
[ 58.667281] shift exponent 64 is too large for 64-bit type 'long unsigned int'
[ 58.674539] CPU: 2 PID: 1170 Comm: xe_gpgpu_fill Not tainted 6.6.0-rc3-guc+ #630
[ 58.674545] Hardware name: Intel Corporation Tiger Lake Client Platform/TigerLake U DDR4 SODIMM RVP, BIOS TGLSFWI1.R00.3243.A01.2006102133 06/10/2020
[ 58.674547] Call Trace:
[ 58.674548] <TASK>
[ 58.674550] dump_stack_lvl+0x92/0xb0
[ 58.674555] __ubsan_handle_shift_out_of_bounds+0x15a/0x300
[ 58.674559] ? rcu_is_watching+0x12/0x60
[ 58.674564] ? software_resume+0x141/0x210
[ 58.674575] ? new_vma+0x44b/0x600 [xe]
[ 58.674606] dma_resv_reserve_fences.cold+0x40/0x66
[ 58.674612] new_vma+0x4b3/0x600 [xe]
[ 58.674638] xe_vm_bind_ioctl+0xffd/0x1e00 [xe]
[ 58.674663] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe]
[ 58.674680] drm_ioctl_kernel+0xc1/0x170
[ 58.674686] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe]
[ 58.674703] drm_ioctl+0x247/0x4c0
[ 58.674709] ? find_held_lock+0x2b/0x80
[ 58.674716] __x64_sys_ioctl+0x8c/0xb0
[ 58.674720] do_syscall_64+0x3c/0x90
[ 58.674723] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ 58.674727] RIP: 0033:0x7fce4bd1aaff
[ 58.674730] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00
[ 58.674731] RSP: 002b:00007ffc57434050 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 58.674734] RAX: ffffffffffffffda RBX: 00007ffc574340e0 RCX: 00007fce4bd1aaff
[ 58.674736] RDX: 00007ffc574340e0 RSI: 0000000040886445 RDI: 0000000000000003
[ 58.674737] RBP: 0000000040886445 R08: 0000000000000002 R09: 00007ffc574341b0
[ 58.674739] R10: 000055de43eb3780 R11: 0000000000000246 R12: 00007ffc574340e0
[ 58.674740] R13: 0000000000000003 R14: 00007ffc574341b0 R15: 0000000000000001
[ 58.674747] </TASK>
[ 58.674748] ================================================================================
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20231215230203.719244-1-matthew.brost@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/linux into drm-next
This tag contains habanalabs driver changes for v6.8.
The notable changes are:
- uAPI changes:
- Add sysfs entry to allow users to identify a device minor id with its
debugfs path
- Add sysfs entry to expose the device's module id as given to us from
the f/w
- Add signed device information retrieval through the INFO ioctl
- New features and improvements:
- Update documentation of debugfs paths
- Add support for Gaudi2C device (new PCI revision number)
- Add pcie reset prepare/done hooks
- Firmware related fixes and changes:
- Print three instances version numbers of Infineon second stage
- Assume hard-reset is done by f/w upon PCIe AXI drain
- Bug fixes and code cleanups:
- Fix information leak in sec_attest_info()
- Avoid overriding existing undefined opcode data in Gaudi2
- Multiple Queue Manager (QMAN) fixes for Gaudi2
- Set hard reset flag if graceful reset is skipped
- Remove 'get temperature' debug print
- Fix the new Event Queue heartbeat mechanism
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Oded Gabbay <ogabbay@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/ZYFpihZscr/fsRRd@ogabbay-vm-u22.habana-labs.com
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This function may copy the pad0 field of struct hl_info_sec_attest to user
mode which has not been initialized, resulting in leakage of kernel heap
data to user mode. To prevent this, use kzalloc() to allocate and zero out
the buffer, which can also eliminate other uninitialized holes, if any.
Fixes: 0c88760f8f5e ("habanalabs/gaudi2: add secured attestation info uapi")
Signed-off-by: Xingyuan Mo <hdthky0@gmail.com>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Part of the undefined opcode data is updated in
gaudi2_handle_qman_err_generic() and some in
handle_lower_qman_data_on_err().
However, the 'write_enable' flag is checked only in
gaudi2_handle_qman_err_generic(), and information of more than a single
error can be mixed there.
Moreover, handle_lower_qman_data_on_err() is called only for the lower
QMAN, so for an error in the upper QMAN there is only a partial info.
Move all the data update to be done in a single place, protected by the
'write_enable' flag.
As mainly the lower QMAN's info is interesting, avoid saving the partial
info for the upper QMAN.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The device debugfs directory was modified to be named as the
device-name.
This name is the parent device name, i.e. either the PCI address in case
of an ASIC, or the simulator device name in case of a simulator.
This change makes it more difficult for a user to access the debugfs
directory for a specific accel device, because he can't just use the
accel minor id, but he needs to do more device-dependent operations to
get the device name.
To make it easier to get this name, add a 'parent_device' sysfs
attribute that the user can read using the minor id before accessing
debugfs.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
QM instructions are in multiples of 64 bits and the command type is in
the upper bits of first QWORD.
To make it clearer that an undefined command is due to a type of 0x0,
always print all 64 bits and add a zero padding if needed.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Infineon controller second stage has 3 instances that their version
need to be reported by driver.
Signed-off-by: Ariel Suller <asuller@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
User will provide a nonce via the INFO ioctl, and will retrieve
the signed device info generated using given nonce.
Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The QM CQ PTR_LO/PTR_HI/TSIZE registers are for pushing a CQ entry, and
although they are updated by HW even when descriptors are fetched by PQ
and CB addresses are fed into CQ, the correct registers to use when
dumping the CQ info are the ones with the _STS suffix.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Module ID exposes the physical location of the device in the server,
from the pov of the devices in regard to how they are connected by
internal fabric.
This information is already exposed in our INFO ioctl, but there are
utilities and scripts running in data-center which are already
accessing sysfs for topology information and it is easier for them
to continue getting that information from sysfs instead of opening
a file descriptor.
Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Failure to map is considered a non-trivial error and we need to notify
the user about it.
Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Upon a QM error, the address/size from both the CQ and the ARC_CQ are
printed, although the instruction that led to the error was received
from only one of them.
Moreover, in case of a QM undefined opcode, only one of these
address/size sets will be captured based on the value of ARC_CQ_PTR.
However, this value can be non-zero even if currently the CQ is used, in
case the CQ/ARC_CQ are alternately used.
Under the assumption of having a stop-on-error configuration, modify to
use CP_STS.CUR_CQ field to get the relevant CQ for the QM error.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
hl_device_cond_reset() might be called with the hard reset flag unset,
because a compute reset upon device release as part of a graceful reset
is valid.
If the conditions for graceful reset are not met, hl_device_reset() will
be called for an immediate reset. In this case a compute reset is not
valid, so it will be replaced with a hard reset together with a debug
message about it.
This message might be confusing, as it implies that a compute reset was
requested when it shouldn't. To prevent this confusion, set the hard
reset flag in hl_device_cond_reset() if going to an immediate reset.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The print was added long back for a specific debug and can
now be removed.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
currently the undefined opcode event bit in set only for lower cp and
only if 'write_enable' is true. It should be set anyway and for all
streams in order to report that event to userspace.
Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Stop rescheduling another heartbeat check when EQ heartbeat check fails
as it generates confusing logs in dmesg that the heartbeat fails.
Signed-off-by: Farah Kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
| |
| |
| |
| |
| |
| |
| | |
Gaudi2 with PCI revision ID with the value of '3' represents Gaudi2C
device and should be detected and initialized as Gaudi2.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Add error log when no eq event is received from FW,
to cover a scenario when FW is stuck for some reason.
In such case driver will not receive neither the eq error interrupt
or the eq heartbeat event, and will just initiate a reset without
indication in the dmesg about the reason.
Signed-off-by: Farah Kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When a PCIe AXI drain event happens, it is possible that the driver
cannot access the device through PCIe, and therefore cannot send a
hard-reset request to FW.
Starting from FW version 1.13, FW will initiate a hard-reset in such
a case without waiting for a reset request from the driver.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Use a predefined mask which set the device critical boot errors.
Driver will fail and stop its loading, only upon detecting at least
one of those errors defined in this mask.
Signed-off-by: Farah Kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When working on a bare-metal system, if FLR will happen the firmware
will handle it and driver will have no knowledge of it, and this will
cause two issues:
1.The driver will be in operational state while it should be in reset.
This will cause the heartbeat mechanism to keep sending messages to FW
while pci device is in reset. Eventually heartbeat will fail and
the device will end up in non-operational state.
2. After FW handles the FLR, and due to the reset it'll go back to
preboot stage, and driver need to perform hard reset in order to
load the boot fit binary.
This patch will add reset_prepare hook that will set the device to
be in disabled state, so it'll be not operational, and also
reset_done hook which will be called after the actual FLR handling,
then it will perform hard reset.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://anongit.freedesktop.org/drm/drm-misc into drm-next
More fixes for the new imagination drier, a DT node refcount fix for the
new aux bridge driver and a missing header fix for the LUT management
code.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/42dw6ok2g5kz5xljrw7t6lzrgafhwslgw3j4rbaaivluv24vkj@k4smx5r3y2gh
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The aux-bridge and aux-hpd-bridge drivers didn't call of_node_get() on
the device nodes further used for dev->of_node and platform data. When
bridge devices are released, the reference counts are decreased,
resulting in refcount underflow / use-after-free warnings. Get
corresponding refcounts during AUX bridge allocation.
Reported-by: Luca Weiss <luca.weiss@fairphone.com>
Fixes: 2a04739139b2 ("drm/bridge: add transparent bridge helper")
Fixes: 26f4bac3d884 ("drm/bridge: aux-hpd: Replace of_device.h with explicit include")
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20231216235910.911958-1-dmitry.baryshkov@linaro.org
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
It is possible to double free the vm_ctx->mmu_ctx object in this
function.
630 err_page_table_destroy:
--> 631 pvr_mmu_context_destroy(vm_ctx->mmu_ctx);
The pvr_vm_context_put() function does:
kref_put(&vm_ctx->ref_count, pvr_vm_context_release);
Here the pvr_vm_context_release() will call:
pvr_mmu_context_destroy(vm_ctx->mmu_ctx);
Refactor to an unwind style.
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Donald Robson <donald.robson@imgtec.com>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20231213144431.94956-2-donald.robson@imgtec.com
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
drivers/gpu/drm/imagination/pvr_vm.c:631 pvr_vm_create_context()
error: 'vm_ctx->mmu_ctx' dereferencing possible ERR_PTR()
612 vm_ctx->mmu_ctx = pvr_mmu_context_create(pvr_dev);
613 err = PTR_ERR_OR_ZERO(&vm_ctx->mmu_ctx);
^
The address is never an error pointer so this will always return 0.
Remove the ampersand.
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Donald Robson <donald.robson@imgtec.com>
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20231213144431.94956-1-donald.robson@imgtec.com
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
While writing the matching IGT suite I discovered that it's possible to
cause a kernel oops when using DRM_IOCTL_PVR_CREATE_HWRT_DATASET when
the call to hwrt_init_common_fw_structure() fails.
Use an unwind-type error path to avoid cleaning up the object using the
the release function before it is fully resolved.
Signed-off-by: Donald Robson <donald.robson@imgtec.com>
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20231208163019.95913-1-donald.robson@imgtec.com
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Unwinding loop in error path for this function uses unsigned limit
variable, causing the promotion of the signed counter variable.
--> 204 for (; pfn >= start_pfn; pfn--)
^^^^^^^^^^^^^^^^
If start_pfn can be zero then this is an endless loop. I've seen this
code in other places as well. This loop is slightly off as well. It
should decrement pfn on the first iteration.
Fix by making the loop limit variables signed. Also fix missing
predecrement by modifying to while loop.
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Donald Robson <donald.robson@imgtec.com>
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20231208160825.92933-1-donald.robson@imgtec.com
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Add a dependency on CONFIG_64BIT since currently the xe driver doesn't
build on 32bits. It may be enabled again after all the issues are fixed.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231221222809.4123220-2-lucas.demarchi@intel.com
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
https://gitlab.freedesktop.org/drm/xe/kernel into drm-next
Introduce a new DRM driver for Intel GPUs
Xe, is a new driver for Intel GPUs that supports both integrated and
discrete platforms. The experimental support starts with Tiger Lake.
i915 will continue be the main production driver for the platforms
up to Meteor Lake and Alchemist. Then the goal is to make this Intel
Xe driver the primary driver for Lunar Lake and newer platforms.
It uses most, if not all, of the key drm concepts, in special: TTM,
drm-scheduler, drm-exec, drm-gpuvm/gpuva and others.
Signed-off-by: Dave Airlie <airlied@redhat.com>
[airlied: add an extra X86 check, fix a typo, fix drm_exec_init interface
change].
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZYSwLgXZUZ57qGPQ@intel.com
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
"err" is not initialized when failing to create and add the freq0 sysfs
file. Remove it from the message. This fixes the following warning with
clang:
../drivers/gpu/drm/xe/xe_gt_freq.c:202:30: error: variable 'err' is uninitialized when used here [-Werror,-Wuninitialized]
kobject_name(gt->sysfs), err);
^~~
Fixes: bef52b5c7a19 ("drm/xe: Create a xe_gt_freq component for raw management and sysfs")
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://lore.kernel.org/r/20231220161923.3740489-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
As part of the FW definitions, we declare each blob as required via the
MODULE_FIRMWARE() macro. This causes the initramfs update (or equivalent
process) to look for the blobs on disk when the kernel is installed;
therefore, we need to make sure that all FWs we define are available in
linux-firmware.
We currently don't plan to push the PVC blob to linux-firmware, while the
LNL one will only be pushed once we have machines in CI to test it, so we
need to remove them from the list for now.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Ideally this header could be included without the CONFIG_FAULT_INJECTION
and it would take care itself for the includes it needs.
So, let's temporary workaround this by moving this below and including
only when CONFIG_FAULT_INJECTION is selected to avoid build breakages.
Another solution would be us including the linux/types.h as well, but
this creates unnecessary cases.
Reference: https://lore.kernel.org/all/20230816134748.979231-1-himal.prasad.ghimiray@intel.com/
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Oded Gabbay <ogabbay@kernel.org>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This kernel uevent is getting removed for now. It will come
back later with a better future proof name.
v2: Rebase (Francois Dugast)
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Cc: Aravind Iddamsetty <aravind.iddamsetty@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: José Roberto de Souza <jose.souza@intel.com>
Acked-by: Mateusz Naklicki <mateusz.naklicki@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The uAPI should stay generic in regarding to the bitmask. It is
the userspace responsibility to check for the type/class of the
memory, without any assumption.
Also add comments inside the code to explain how it is actually
constructed so we don't accidentally change the assignment of
the instance and the masks.
No functional change in this patch. It only explains and document
the memory_region masks. A further follow-up work with the
organization of all memory regions around struct xe_mem_regions
is desired, but not part of this patch.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: José Roberto de Souza <jose.souza@intel.com>
Acked-by: Mateusz Naklicki <mateusz.naklicki@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Remove concept of async vs sync VM bind queues, rather make all binds
async.
The following bits have dropped from the uAPI:
DRM_XE_ENGINE_CLASS_VM_BIND_ASYNC
DRM_XE_ENGINE_CLASS_VM_BIND_SYNC
DRM_XE_VM_CREATE_FLAG_ASYNC_DEFAULT
DRM_XE_VM_BIND_FLAG_ASYNC
To implement sync binds the UMD is expected to use the out-fence
interface.
v2: Send correct version
v3: Drop drm_xe_syncs
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: José Roberto de Souza <jose.souza@intel.com>
Acked-by: Mateusz Naklicki <mateusz.naklicki@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
To ensure consistency and avoid possible later conflicts,
let's add drm_xe prefix to xe_user_extension struct.
Cc: Francois Dugast <francois.dugast@intel.com>
Suggested-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: José Roberto de Souza <jose.souza@intel.com>
Acked-by: Mateusz Naklicki <mateusz.naklicki@intel.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
PMU uapi is likely to change in the future. Till the uapi is finalized,
remove PMU from Xe. PMU can be re-added after uapi is finalized.
v2: Include xe_drm.h in xe/tests/xe_dma_buf.c (Francois)
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Acked-by: Aravind Iddamsetty <aravind.iddamsetty@linux.intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Acked-by: José Roberto de Souza <jose.souza@intel.com>
Acked-by: Mateusz Naklicki <mateusz.naklicki@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Currently xe_wait_user_fence_ioctl is not checking exec_queue state
and blocking until timeout, with this patch wakeup the blocking wait
if exec_queue reset happen and returning proper error code
Signed-off-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
Cc: Oak Zeng <oak.zeng@intel.com>
Cc: Kempczynski Zbigniew <Zbigniew.Kempczynski@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: José Roberto de Souza <jose.souza@intel.com>
Acked-by: Mateusz Naklicki <mateusz.naklicki@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|