summaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/nouveau/nvkm/core
Commit message (Collapse)AuthorAgeFilesLines
* nouveau: lock the client object tree.Dave Airlie2024-03-082-6/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It appears the client object tree has no locking unless I've missed something else. Fix races around adding/removing client objects, mostly vram bar mappings. 4562.099306] general protection fault, probably for non-canonical address 0x6677ed422bceb80c: 0000 [#1] PREEMPT SMP PTI [ 4562.099314] CPU: 2 PID: 23171 Comm: deqp-vk Not tainted 6.8.0-rc6+ #27 [ 4562.099324] Hardware name: Gigabyte Technology Co., Ltd. Z390 I AORUS PRO WIFI/Z390 I AORUS PRO WIFI-CF, BIOS F8 11/05/2021 [ 4562.099330] RIP: 0010:nvkm_object_search+0x1d/0x70 [nouveau] [ 4562.099503] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 66 0f 1f 00 0f 1f 44 00 00 48 89 f8 48 85 f6 74 39 48 8b 87 a0 00 00 00 48 85 c0 74 12 <48> 8b 48 f8 48 39 ce 73 15 48 8b 40 10 48 85 c0 75 ee 48 c7 c0 fe [ 4562.099506] RSP: 0000:ffffa94cc420bbf8 EFLAGS: 00010206 [ 4562.099512] RAX: 6677ed422bceb814 RBX: ffff98108791f400 RCX: ffff9810f26b8f58 [ 4562.099517] RDX: 0000000000000000 RSI: ffff9810f26b9158 RDI: ffff98108791f400 [ 4562.099519] RBP: ffff9810f26b9158 R08: 0000000000000000 R09: 0000000000000000 [ 4562.099521] R10: ffffa94cc420bc48 R11: 0000000000000001 R12: ffff9810f02a7cc0 [ 4562.099526] R13: 0000000000000000 R14: 00000000000000ff R15: 0000000000000007 [ 4562.099528] FS: 00007f629c5017c0(0000) GS:ffff98142c700000(0000) knlGS:0000000000000000 [ 4562.099534] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 4562.099536] CR2: 00007f629a882000 CR3: 000000017019e004 CR4: 00000000003706f0 [ 4562.099541] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 4562.099542] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 4562.099544] Call Trace: [ 4562.099555] <TASK> [ 4562.099573] ? die_addr+0x36/0x90 [ 4562.099583] ? exc_general_protection+0x246/0x4a0 [ 4562.099593] ? asm_exc_general_protection+0x26/0x30 [ 4562.099600] ? nvkm_object_search+0x1d/0x70 [nouveau] [ 4562.099730] nvkm_ioctl+0xa1/0x250 [nouveau] [ 4562.099861] nvif_object_map_handle+0xc8/0x180 [nouveau] [ 4562.099986] nouveau_ttm_io_mem_reserve+0x122/0x270 [nouveau] [ 4562.100156] ? dma_resv_test_signaled+0x26/0xb0 [ 4562.100163] ttm_bo_vm_fault_reserved+0x97/0x3c0 [ttm] [ 4562.100182] ? __mutex_unlock_slowpath+0x2a/0x270 [ 4562.100189] nouveau_ttm_fault+0x69/0xb0 [nouveau] [ 4562.100356] __do_fault+0x32/0x150 [ 4562.100362] do_fault+0x7c/0x560 [ 4562.100369] __handle_mm_fault+0x800/0xc10 [ 4562.100382] handle_mm_fault+0x17c/0x3e0 [ 4562.100388] do_user_addr_fault+0x208/0x860 [ 4562.100395] exc_page_fault+0x7f/0x200 [ 4562.100402] asm_exc_page_fault+0x26/0x30 [ 4562.100412] RIP: 0033:0x9b9870 [ 4562.100419] Code: 85 a8 f7 ff ff 8b 8d 80 f7 ff ff 89 08 e9 18 f2 ff ff 0f 1f 84 00 00 00 00 00 44 89 32 e9 90 fa ff ff 0f 1f 84 00 00 00 00 00 <44> 89 32 e9 f8 f1 ff ff 0f 1f 84 00 00 00 00 00 66 44 89 32 e9 e7 [ 4562.100422] RSP: 002b:00007fff9ba2dc70 EFLAGS: 00010246 [ 4562.100426] RAX: 0000000000000004 RBX: 000000000dd65e10 RCX: 000000fff0000000 [ 4562.100428] RDX: 00007f629a882000 RSI: 00007f629a882000 RDI: 0000000000000066 [ 4562.100432] RBP: 00007fff9ba2e570 R08: 0000000000000000 R09: 0000000123ddf000 [ 4562.100434] R10: 0000000000000001 R11: 0000000000000246 R12: 000000007fffffff [ 4562.100436] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 4562.100446] </TASK> [ 4562.100448] Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables libcrc32c nfnetlink cmac bnep sunrpc iwlmvm intel_rapl_msr intel_rapl_common snd_sof_pci_intel_cnl x86_pkg_temp_thermal intel_powerclamp snd_sof_intel_hda_common mac80211 coretemp snd_soc_acpi_intel_match kvm_intel snd_soc_acpi snd_soc_hdac_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof_intel_hda_mlink snd_sof_intel_hda snd_sof kvm snd_sof_utils snd_soc_core snd_hda_codec_realtek libarc4 snd_hda_codec_generic snd_compress snd_hda_ext_core vfat fat snd_hda_intel snd_intel_dspcfg irqbypass iwlwifi snd_hda_codec snd_hwdep snd_hda_core btusb btrtl mei_hdcp iTCO_wdt rapl mei_pxp btintel snd_seq iTCO_vendor_support btbcm snd_seq_device intel_cstate bluetooth snd_pcm cfg80211 intel_wmi_thunderbolt wmi_bmof intel_uncore snd_timer mei_me snd ecdh_generic i2c_i801 [ 4562.100541] ecc mei i2c_smbus soundcore rfkill intel_pch_thermal acpi_pad zram nouveau drm_ttm_helper ttm gpu_sched i2c_algo_bit drm_gpuvm drm_exec mxm_wmi drm_display_helper drm_kms_helper drm crct10dif_pclmul crc32_pclmul nvme e1000e crc32c_intel nvme_core ghash_clmulni_intel video wmi pinctrl_cannonlake ip6_tables ip_tables fuse [ 4562.100616] ---[ end trace 0000000000000000 ]--- Signed-off-by: Dave Airlie <airlied@redhat.com> Cc: stable@vger.kernel.org
* nouveau: use an rwlock for the event lock.Dave Airlie2023-11-141-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows it to break the following circular locking dependency. Aug 10 07:01:29 dg1test kernel: ====================================================== Aug 10 07:01:29 dg1test kernel: WARNING: possible circular locking dependency detected Aug 10 07:01:29 dg1test kernel: 6.4.0-rc7+ #10 Not tainted Aug 10 07:01:29 dg1test kernel: ------------------------------------------------------ Aug 10 07:01:29 dg1test kernel: wireplumber/2236 is trying to acquire lock: Aug 10 07:01:29 dg1test kernel: ffff8fca5320da18 (&fctx->lock){-...}-{2:2}, at: nouveau_fence_wait_uevent_handler+0x2b/0x100 [nouveau] Aug 10 07:01:29 dg1test kernel: but task is already holding lock: Aug 10 07:01:29 dg1test kernel: ffff8fca41208610 (&event->list_lock#2){-...}-{2:2}, at: nvkm_event_ntfy+0x50/0xf0 [nouveau] Aug 10 07:01:29 dg1test kernel: which lock already depends on the new lock. Aug 10 07:01:29 dg1test kernel: the existing dependency chain (in reverse order) is: Aug 10 07:01:29 dg1test kernel: -> #3 (&event->list_lock#2){-...}-{2:2}: Aug 10 07:01:29 dg1test kernel: _raw_spin_lock_irqsave+0x4b/0x70 Aug 10 07:01:29 dg1test kernel: nvkm_event_ntfy+0x50/0xf0 [nouveau] Aug 10 07:01:29 dg1test kernel: ga100_fifo_nonstall_intr+0x24/0x30 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_intr+0x12c/0x240 [nouveau] Aug 10 07:01:29 dg1test kernel: __handle_irq_event_percpu+0x88/0x240 Aug 10 07:01:29 dg1test kernel: handle_irq_event+0x38/0x80 Aug 10 07:01:29 dg1test kernel: handle_edge_irq+0xa3/0x240 Aug 10 07:01:29 dg1test kernel: __common_interrupt+0x72/0x160 Aug 10 07:01:29 dg1test kernel: common_interrupt+0x60/0xe0 Aug 10 07:01:29 dg1test kernel: asm_common_interrupt+0x26/0x40 Aug 10 07:01:29 dg1test kernel: -> #2 (&device->intr.lock){-...}-{2:2}: Aug 10 07:01:29 dg1test kernel: _raw_spin_lock_irqsave+0x4b/0x70 Aug 10 07:01:29 dg1test kernel: nvkm_inth_allow+0x2c/0x80 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_event_ntfy_state+0x181/0x250 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_event_ntfy_allow+0x63/0xd0 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_uevent_mthd+0x4d/0x70 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_ioctl+0x10b/0x250 [nouveau] Aug 10 07:01:29 dg1test kernel: nvif_object_mthd+0xa8/0x1f0 [nouveau] Aug 10 07:01:29 dg1test kernel: nvif_event_allow+0x2a/0xa0 [nouveau] Aug 10 07:01:29 dg1test kernel: nouveau_fence_enable_signaling+0x78/0x80 [nouveau] Aug 10 07:01:29 dg1test kernel: __dma_fence_enable_signaling+0x5e/0x100 Aug 10 07:01:29 dg1test kernel: dma_fence_add_callback+0x4b/0xd0 Aug 10 07:01:29 dg1test kernel: nouveau_cli_work_queue+0xae/0x110 [nouveau] Aug 10 07:01:29 dg1test kernel: nouveau_gem_object_close+0x1d1/0x2a0 [nouveau] Aug 10 07:01:29 dg1test kernel: drm_gem_handle_delete+0x70/0xe0 [drm] Aug 10 07:01:29 dg1test kernel: drm_ioctl_kernel+0xa5/0x150 [drm] Aug 10 07:01:29 dg1test kernel: drm_ioctl+0x256/0x490 [drm] Aug 10 07:01:29 dg1test kernel: nouveau_drm_ioctl+0x5a/0xb0 [nouveau] Aug 10 07:01:29 dg1test kernel: __x64_sys_ioctl+0x91/0xd0 Aug 10 07:01:29 dg1test kernel: do_syscall_64+0x3c/0x90 Aug 10 07:01:29 dg1test kernel: entry_SYSCALL_64_after_hwframe+0x72/0xdc Aug 10 07:01:29 dg1test kernel: -> #1 (&event->refs_lock#4){....}-{2:2}: Aug 10 07:01:29 dg1test kernel: _raw_spin_lock_irqsave+0x4b/0x70 Aug 10 07:01:29 dg1test kernel: nvkm_event_ntfy_state+0x37/0x250 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_event_ntfy_allow+0x63/0xd0 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_uevent_mthd+0x4d/0x70 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_ioctl+0x10b/0x250 [nouveau] Aug 10 07:01:29 dg1test kernel: nvif_object_mthd+0xa8/0x1f0 [nouveau] Aug 10 07:01:29 dg1test kernel: nvif_event_allow+0x2a/0xa0 [nouveau] Aug 10 07:01:29 dg1test kernel: nouveau_fence_enable_signaling+0x78/0x80 [nouveau] Aug 10 07:01:29 dg1test kernel: __dma_fence_enable_signaling+0x5e/0x100 Aug 10 07:01:29 dg1test kernel: dma_fence_add_callback+0x4b/0xd0 Aug 10 07:01:29 dg1test kernel: nouveau_cli_work_queue+0xae/0x110 [nouveau] Aug 10 07:01:29 dg1test kernel: nouveau_gem_object_close+0x1d1/0x2a0 [nouveau] Aug 10 07:01:29 dg1test kernel: drm_gem_handle_delete+0x70/0xe0 [drm] Aug 10 07:01:29 dg1test kernel: drm_ioctl_kernel+0xa5/0x150 [drm] Aug 10 07:01:29 dg1test kernel: drm_ioctl+0x256/0x490 [drm] Aug 10 07:01:29 dg1test kernel: nouveau_drm_ioctl+0x5a/0xb0 [nouveau] Aug 10 07:01:29 dg1test kernel: __x64_sys_ioctl+0x91/0xd0 Aug 10 07:01:29 dg1test kernel: do_syscall_64+0x3c/0x90 Aug 10 07:01:29 dg1test kernel: entry_SYSCALL_64_after_hwframe+0x72/0xdc Aug 10 07:01:29 dg1test kernel: -> #0 (&fctx->lock){-...}-{2:2}: Aug 10 07:01:29 dg1test kernel: __lock_acquire+0x14e3/0x2240 Aug 10 07:01:29 dg1test kernel: lock_acquire+0xc8/0x2a0 Aug 10 07:01:29 dg1test kernel: _raw_spin_lock_irqsave+0x4b/0x70 Aug 10 07:01:29 dg1test kernel: nouveau_fence_wait_uevent_handler+0x2b/0x100 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_client_event+0xf/0x20 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_event_ntfy+0x9b/0xf0 [nouveau] Aug 10 07:01:29 dg1test kernel: ga100_fifo_nonstall_intr+0x24/0x30 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_intr+0x12c/0x240 [nouveau] Aug 10 07:01:29 dg1test kernel: __handle_irq_event_percpu+0x88/0x240 Aug 10 07:01:29 dg1test kernel: handle_irq_event+0x38/0x80 Aug 10 07:01:29 dg1test kernel: handle_edge_irq+0xa3/0x240 Aug 10 07:01:29 dg1test kernel: __common_interrupt+0x72/0x160 Aug 10 07:01:29 dg1test kernel: common_interrupt+0x60/0xe0 Aug 10 07:01:29 dg1test kernel: asm_common_interrupt+0x26/0x40 Aug 10 07:01:29 dg1test kernel: other info that might help us debug this: Aug 10 07:01:29 dg1test kernel: Chain exists of: &fctx->lock --> &device->intr.lock --> &event->list_lock#2 Aug 10 07:01:29 dg1test kernel: Possible unsafe locking scenario: Aug 10 07:01:29 dg1test kernel: CPU0 CPU1 Aug 10 07:01:29 dg1test kernel: ---- ---- Aug 10 07:01:29 dg1test kernel: lock(&event->list_lock#2); Aug 10 07:01:29 dg1test kernel: lock(&device->intr.lock); Aug 10 07:01:29 dg1test kernel: lock(&event->list_lock#2); Aug 10 07:01:29 dg1test kernel: lock(&fctx->lock); Aug 10 07:01:29 dg1test kernel: *** DEADLOCK *** Aug 10 07:01:29 dg1test kernel: 2 locks held by wireplumber/2236: Aug 10 07:01:29 dg1test kernel: #0: ffff8fca53177bf8 (&device->intr.lock){-...}-{2:2}, at: nvkm_intr+0x29/0x240 [nouveau] Aug 10 07:01:29 dg1test kernel: #1: ffff8fca41208610 (&event->list_lock#2){-...}-{2:2}, at: nvkm_event_ntfy+0x50/0xf0 [nouveau] Aug 10 07:01:29 dg1test kernel: stack backtrace: Aug 10 07:01:29 dg1test kernel: CPU: 6 PID: 2236 Comm: wireplumber Not tainted 6.4.0-rc7+ #10 Aug 10 07:01:29 dg1test kernel: Hardware name: Gigabyte Technology Co., Ltd. Z390 I AORUS PRO WIFI/Z390 I AORUS PRO WIFI-CF, BIOS F8 11/05/2021 Aug 10 07:01:29 dg1test kernel: Call Trace: Aug 10 07:01:29 dg1test kernel: <TASK> Aug 10 07:01:29 dg1test kernel: dump_stack_lvl+0x5b/0x90 Aug 10 07:01:29 dg1test kernel: check_noncircular+0xe2/0x110 Aug 10 07:01:29 dg1test kernel: __lock_acquire+0x14e3/0x2240 Aug 10 07:01:29 dg1test kernel: lock_acquire+0xc8/0x2a0 Aug 10 07:01:29 dg1test kernel: ? nouveau_fence_wait_uevent_handler+0x2b/0x100 [nouveau] Aug 10 07:01:29 dg1test kernel: ? lock_acquire+0xc8/0x2a0 Aug 10 07:01:29 dg1test kernel: _raw_spin_lock_irqsave+0x4b/0x70 Aug 10 07:01:29 dg1test kernel: ? nouveau_fence_wait_uevent_handler+0x2b/0x100 [nouveau] Aug 10 07:01:29 dg1test kernel: nouveau_fence_wait_uevent_handler+0x2b/0x100 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_client_event+0xf/0x20 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_event_ntfy+0x9b/0xf0 [nouveau] Aug 10 07:01:29 dg1test kernel: ga100_fifo_nonstall_intr+0x24/0x30 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_intr+0x12c/0x240 [nouveau] Aug 10 07:01:29 dg1test kernel: __handle_irq_event_percpu+0x88/0x240 Aug 10 07:01:29 dg1test kernel: handle_irq_event+0x38/0x80 Aug 10 07:01:29 dg1test kernel: handle_edge_irq+0xa3/0x240 Aug 10 07:01:29 dg1test kernel: __common_interrupt+0x72/0x160 Aug 10 07:01:29 dg1test kernel: common_interrupt+0x60/0xe0 Aug 10 07:01:29 dg1test kernel: asm_common_interrupt+0x26/0x40 Aug 10 07:01:29 dg1test kernel: RIP: 0033:0x7fb66174d700 Aug 10 07:01:29 dg1test kernel: Code: c1 e2 05 29 ca 8d 0c 10 0f be 07 84 c0 75 eb 89 c8 c3 0f 1f 84 00 00 00 00 00 f3 0f 1e fa e9 d7 0f fc ff 0f 1f 80 00 00 00 00 <f3> 0f 1e fa e9 c7 0f fc> Aug 10 07:01:29 dg1test kernel: RSP: 002b:00007ffdd3c48438 EFLAGS: 00000206 Aug 10 07:01:29 dg1test kernel: RAX: 000055bb758763c0 RBX: 000055bb758752c0 RCX: 00000000000028b0 Aug 10 07:01:29 dg1test kernel: RDX: 000055bb758752c0 RSI: 000055bb75887490 RDI: 000055bb75862950 Aug 10 07:01:29 dg1test kernel: RBP: 00007ffdd3c48490 R08: 000055bb75873b10 R09: 0000000000000001 Aug 10 07:01:29 dg1test kernel: R10: 0000000000000004 R11: 000055bb7587f000 R12: 000055bb75887490 Aug 10 07:01:29 dg1test kernel: R13: 000055bb757f6280 R14: 000055bb758875c0 R15: 000055bb757f6280 Aug 10 07:01:29 dg1test kernel: </TASK> Signed-off-by: Dave Airlie <airlied@redhat.com> Tested-by: Danilo Krummrich <dakr@redhat.com> Reviewed-by: Danilo Krummrich <dakr@redhat.com> Signed-off-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231107053255.2257079-1-airlied@gmail.com
* drm/nouveau/nvkm: support loading fws into sg_tableBen Skeggs2023-10-311-3/+71
| | | | | | | | | - preparation for GSP-RM, which has massive FW images - based on a patch by Dave Airlie Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230918202149.4343-32-skeggsb@gmail.com
* drm/nouveau/imem: support allocations not preserved across suspendBen Skeggs2023-09-191-2/+13
| | | | | | | | | | | Will initially be used to tag some large grctx allocations which don't need to be saved, to speedup suspend/resume. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Acked-by: Danilo Krummrich <me@dakr.org> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230919220442.202488-3-lyude@redhat.com
* drm/nouveau/core: refactor deprecated strncpyJustin Stitt2023-09-151-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | `strncpy` is deprecated for use on NUL-terminated destination strings [1]. We should prefer more robust and less ambiguous string interfaces. A suitable replacement is `strscpy` [2] due to the fact that it guarantees NUL-termination on the destination buffer without unnecessarily NUL-padding. There is likely no bug in the current implementation due to the safeguard: | cname[sizeof(cname) - 1] = '\0'; ... however we can provide simpler and easier to understand code using the newer (and recommended) `strscpy` api. Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1] Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2] Link: https://github.com/KSPP/linux/issues/90 Cc: linux-hardening@vger.kernel.org Signed-off-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230914-strncpy-drivers-gpu-drm-nouveau-nvkm-core-firmware-c-v1-1-3aeae46c032f@google.com
* drm/nouveau/nvkm: punt spurious irq messages to debug levelBen Skeggs2023-07-061-2/+2
| | | | | | | | | | | This can be completely normal in some situations (ie. non-stall intrs when nothing is waiting on them). Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230525003106.3853741-2-skeggsb@gmail.com
* drm/nouveau/nvkm: fini object children in reverse orderBen Skeggs2023-07-061-1/+1
| | | | | | | | | | | | | | | | | Turns out, we're currently tearing down the disp core channel *before* the satellite channels (wndw, etc) during suspend. This makes RM return NV_ERR_NOT_SUPPORTED on attempting to reallocate the core channel on resume for some reason, but we probably shouldn't be doing it on HW either. Tear down children in the reverse of allocation order instead. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230525003106.3853741-1-skeggsb@gmail.com
* drm/nouveau/acr/gm20b: regression fixesBen Skeggs2023-01-301-0/+3
| | | | | | | | | | | Missed some Tegra-specific quirks when reworking ACR to support Ampere. Fixes: 2541626cfb79 ("drm/nouveau/acr: use common falcon HS FW code for ACR FWs") Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Tested-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt> Tested-by: Nicolas Chauvet <kwizart@gmail.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230130223715.1831509-3-bskeggs@redhat.com
* drm/nouveau/engine: add HAL for engine-specific rc reset procedureBen Skeggs2022-11-091-0/+10
| | | | | | | Will be used to improve gr reset on GF100 and newer. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/acr: use common falcon HS FW code for ACR FWsBen Skeggs2022-11-091-1/+86
| | | | | | | | | | | Adds context binding and support for FWs with a bootloader to the code that was added to load VPR scrubber HS binaries, and ports ACR over to using all of it. - gv100 split from gp108 to handle FW exit status differences Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/flcn: new code to load+boot simple HS FWs (VPR scrubber)Ben Skeggs2022-11-091-0/+42
| | | | | | | | | | | | | | | | | | | Adds the start of common interfaces to load and boot the HS binaries provided by NVIDIA that enable the usage of GR. ACR already handles most of this, but it's very much tied into ACR's init process, and there's other code that could benefit from reusing a lot of this stuff too (ie. VBIOS DEVINIT/PreOS, VPR scrubber). The VPR scrubber code is fairly independent, and a good first target. - adds better debug output to fw loading process, to ease bring-up/debug v2: - whitespace, 0->false Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/fifo: add new engine context trackingBen Skeggs2022-11-091-1/+6
| | | | | | | | | | | | | Channel groups have somewhat more complicated requirements than what we currently support. An engine context is shared between all channels in a channel group, VEID/subctx support (later) brings per-VEID components, and we need to track an individual channel's engine context pointers. This commit adds the structures and refcounting to support the above, wrapping the prior implementation for the moment. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/fifo: expose runlist topology info on all chipsetsBen Skeggs2022-11-091-8/+4
| | | | | | | | | Previously only available from Kepler onwards. - also fixes the info() queries causing fifo init()/fini() unnecessarily Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/nvkm: add locking to subdev/engine init pathsBen Skeggs2022-11-092-60/+120
| | | | | | | | | | | | | | | | This wasn't really needed before; the main place this could race is with channel recovery, but (through potentially fragile means) shouldn't have been possible. However, a number of upcoming patches benefit from having better control over subdev init, necessitating some improvements here. - allows subdev/engine oneinit() without init() (host/fifo patches) - merges engine use locking/tracking into subdev, and extends it to fix some issues that will arise with future usage patterns (acr patches) Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/mc: implement intr handling on top of nvkm_intrBen Skeggs2022-11-091-9/+1
| | | | | | | | - new-style handlers can now be used here too - decent clean-up Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/intr: add nvkm_subdev_intr() compatibilityBen Skeggs2022-11-091-0/+61
| | | | | | | | | | | It's quite a lot of tedious and error-prone work to switch over all the subdevs at once, so allow an nvkm_intr to request new-style handlers to be created that wrap the existing interfaces. This will allow a more gradual transition. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/intr: support multiple trees, and explicit interfacesBen Skeggs2022-11-091-2/+282
| | | | | | | | | | | | | | | Turing adds a second top-level interrupt tree in HW, in addition to the trees available via NV_PMC. Most of the interrupts we care about are exposed in both trees, but not all of them, and we have some rather nasty hacks to route the fault buffer interrupts. Ampere removes the NV_PMC trees entirely. Here we add some infrastructure to be able to handle all of this more cleanly, as well as providing more explicit control over handlers. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/intr: add shared interrupt plumbing between pci/tegraBen Skeggs2022-11-092-0/+110
| | | | | | | | Unifies the handling between PCI-based and Tegra GPUs, and makes more explicit/obvious where device interrupts can be expected. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/nvkm: give each nvkm_event its own lockdep classBen Skeggs2022-11-091-4/+2
| | | | | | | | | | | | | | | | The vblank and nonstall events have some annoying interactions with DRM locking, and aren't able to do certain things as a result. However, other uses of event notifications don't have such requirements, and upcoming patches take advantage of this for various improvements. Having separate classes for each nvkm_event's spinlocks allows lockdep to distinguish between them and avoid false-positives. v2: __always_inline + comment Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/nvkm: rip out old notifyBen Skeggs2022-11-095-402/+10
| | | | | Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/nvkm: add a replacement for nvkm_notifyBen Skeggs2022-11-096-13/+339
| | | | | | | | | | | | | | | | This replaces the twisty, confusing, relationship between nvkm_event and nvkm_notify with something much simpler, and less racey. It also places events in the object tree hierarchy, which will allow a heap of the code tracking events across allocation/teardown/suspend to be removed. This commit just adds the new interfaces, and passes the owning subdev to the event constructor to enable debug-tracing in the new code. v2: - use ?: (lyude) Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/device: remove pwrsrc notify in favour of a direct call to clkBen Skeggs2022-07-131-0/+1
| | | | | | | Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
* drm/nouveau/nvkm: use list_add_tail() when building object treeBen Skeggs2022-07-131-1/+1
| | | | | | | | | | | | Fixes resume from hibernate failing on (at least) TU102, where cursor channel init failed due to being performed before the core channel. Not solid idea why suspend-to-ram worked, but, presumably HW being in an entirely clean state has something to do with it. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
* drm/nouveau/core/client: Mark nvkm_uclient_sclass with static keywordZou Wei2021-11-121-1/+1
| | | | | | | | | | | | Fix the following sparse warning: drivers/gpu/drm/nouveau/nvkm/core/client.c:64:1: warning: symbol 'nvkm_uclient_sclass' was not declared. Should it be static? Signed-off-by: Zou Wei <zou_wei@huawei.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com> Link: https://gitlab.freedesktop.org/drm/nouveau/-/merge_requests/10
* drm/nouveau: rip out nvkm_client.superBen Skeggs2021-08-181-3/+1
| | | | | | | | | No longer required now that userspace can't touch anything that might need it, and should fix DRM MM operations racing with each other, and the random hangs/crashes that come with that. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/nvkm: remove nvkm_subdev.indexBen Skeggs2021-02-111-1/+0
| | | | | Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/vic: switch to instanced constructorBen Skeggs2021-02-111-1/+0
| | | | | Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/sw: switch to instanced constructorBen Skeggs2021-02-112-7/+5
| | | | | Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/sec2: switch to instanced constructorBen Skeggs2021-02-111-1/+0
| | | | | Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/sec: switch to instanced constructorBen Skeggs2021-02-111-1/+0
| | | | | Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/pm: switch to instanced constructorBen Skeggs2021-02-111-1/+0
| | | | | Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/nvenc: switch to instanced constructorBen Skeggs2021-02-111-15/+0
| | | | | Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/nvdec: switch to instanced constructorBen Skeggs2021-02-111-7/+0
| | | | | Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/msvld: switch to instanced constructorBen Skeggs2021-02-111-1/+0
| | | | | Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/msppp: switch to instanced constructorBen Skeggs2021-02-111-1/+0
| | | | | Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/mspdec: switch to instanced constructorBen Skeggs2021-02-111-1/+0
| | | | | Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/msenc: switch to instanced constructorBen Skeggs2021-02-111-1/+0
| | | | | Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/mpeg: switch to instanced constructorBen Skeggs2021-02-112-3/+2
| | | | | Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/me: switch to instanced constructorBen Skeggs2021-02-111-1/+0
| | | | | Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/ifb: switch to instanced constructorBen Skeggs2021-02-111-1/+0
| | | | | Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/gr: switch to instanced constructorBen Skeggs2021-02-111-1/+0
| | | | | Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/fifo: switch to instanced constructorBen Skeggs2021-02-111-1/+0
| | | | | Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/dma: switch to instanced constructorBen Skeggs2021-02-111-1/+0
| | | | | Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/disp: switch to instanced constructorBen Skeggs2021-02-111-1/+0
| | | | | Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/cipher: switch to instanced constructorBen Skeggs2021-02-111-1/+0
| | | | | Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/ce: switch to instanced constructorBen Skeggs2021-02-112-16/+3
| | | | | Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/bsp,vp: switch to instanced constructorBen Skeggs2021-02-112-6/+3
| | | | | Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/volt: switch to instanced constructorBen Skeggs2021-02-111-1/+0
| | | | | Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/top: switch to instanced constructorBen Skeggs2021-02-111-1/+0
| | | | | Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
* drm/nouveau/tmr: switch to instanced constructorBen Skeggs2021-02-111-1/+0
| | | | | Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>