path: root/drivers/misc
Commit message [Author; Age; Files; Lines]
* misc/vmw_vmci: fix an infoleak in vmci_host_do_receive_datagram() [Alexander Potapenko; 2022-11-09; 1 file; -0/+2]
  `struct vmci_event_qp` allocated by qp_notify_peer() contains padding, which may carry uninitialized data to the userspace, as observed by KMSAN:

    BUG: KMSAN: kernel-infoleak in instrument_copy_to_user ./include/linux/instrumented.h:121
     instrument_copy_to_user ./include/linux/instrumented.h:121
     _copy_to_user+0x5f/0xb0 lib/usercopy.c:33
     copy_to_user ./include/linux/uaccess.h:169
     vmci_host_do_receive_datagram drivers/misc/vmw_vmci/vmci_host.c:431
     vmci_host_unlocked_ioctl+0x33d/0x43d0 drivers/misc/vmw_vmci/vmci_host.c:925
     vfs_ioctl fs/ioctl.c:51
    ...

    Uninit was stored to memory at:
     kmemdup+0x74/0xb0 mm/util.c:131
     dg_dispatch_as_host drivers/misc/vmw_vmci/vmci_datagram.c:271
     vmci_datagram_dispatch+0x4f8/0xfc0 drivers/misc/vmw_vmci/vmci_datagram.c:339
     qp_notify_peer+0x19a/0x290 drivers/misc/vmw_vmci/vmci_queue_pair.c:1479
     qp_broker_attach drivers/misc/vmw_vmci/vmci_queue_pair.c:1662
     qp_broker_alloc+0x2977/0x2f30 drivers/misc/vmw_vmci/vmci_queue_pair.c:1750
     vmci_qp_broker_alloc+0x96/0xd0 drivers/misc/vmw_vmci/vmci_queue_pair.c:1940
     vmci_host_do_alloc_queuepair drivers/misc/vmw_vmci/vmci_host.c:488
     vmci_host_unlocked_ioctl+0x24fd/0x43d0 drivers/misc/vmw_vmci/vmci_host.c:927
    ...

    Local variable ev created at:
     qp_notify_peer+0x54/0x290 drivers/misc/vmw_vmci/vmci_queue_pair.c:1456
     qp_broker_attach drivers/misc/vmw_vmci/vmci_queue_pair.c:1662
     qp_broker_alloc+0x2977/0x2f30 drivers/misc/vmw_vmci/vmci_queue_pair.c:1750

    Bytes 28-31 of 48 are uninitialized
    Memory access of size 48 starts at ffff888035155e00
    Data copied to user address 0000000020000100

  Use memset() to prevent the infoleaks. Also speculatively fix qp_notify_peer_local(), which may suffer from the same problem.

  Reported-by: syzbot+39be4da489ed2493ba25@syzkaller.appspotmail.com
  Cc: stable <stable@kernel.org>
  Fixes: 06164d2b72aa ("VMCI: queue pairs implementation.")
  Signed-off-by: Alexander Potapenko <glider@google.com>
  Reviewed-by: Vishnu Dasa <vdasa@vmware.com>
  Link: https://lore.kernel.org/r/20221104175849.2782567-1-glider@google.com
  Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
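The memset() approach named in the commit above can be illustrated in plain userspace C. This is only a sketch under stated assumptions: `struct demo_event` and both helpers are hypothetical names, not the VMCI structures, and the layout comment describes what common 64-bit ABIs do rather than a guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical event layout (not the real vmci_event_qp). */
struct demo_event {
	uint32_t type;     /* 4 bytes */
	/* common 64-bit ABIs insert 4 bytes of padding here */
	uint64_t payload;  /* 8-byte aligned on those ABIs */
};

/* Fill `ev` so that every byte, padding included, has a defined value. */
void demo_event_init(struct demo_event *ev, uint32_t type, uint64_t payload)
{
	assert(ev != NULL);
	memset(ev, 0, sizeof(*ev));  /* clears padding as well as fields */
	ev->type = type;
	ev->payload = payload;
}

/* Count non-zero bytes between the end of `type` and the start of `payload`. */
size_t demo_event_dirty_padding(const struct demo_event *ev)
{
	const unsigned char *raw = (const unsigned char *)ev;
	size_t start = offsetof(struct demo_event, type) + sizeof(ev->type);
	size_t end = offsetof(struct demo_event, payload);
	size_t dirty = 0;

	for (size_t i = start; i < end; i++)
		dirty += (raw[i] != 0);
	return dirty;
}
```

Without the memset(), the padding bytes would keep whatever was previously on the stack, and copying the whole structure out (as copy_to_user() does in the trace) would expose them.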
* misc: sgi-gru: use explicitly signed char [Jason A. Donenfeld; 2022-10-25; 2 files; -10/+10]
  With char becoming unsigned by default, and with `char` alone being ambiguous and based on architecture, signed chars need to be marked explicitly as such. This fixes warnings like:

    drivers/misc/sgi-gru/grumain.c:711 gru_check_chiplet_assignment() warn: 'gts->ts_user_chiplet_id' is unsigned

  Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com>
  Cc: Arnd Bergmann <arnd@arndb.de>
  Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
  Link: https://lore.kernel.org/r/20221025025223.573543-1-Jason@zx2c4.com
  Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
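Why the explicit `signed char` matters can be shown with a small sketch. The structure and the use of -1 as a "no preference" sentinel are assumptions made for illustration; only the field name in the warning comes from the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a field that uses -1 to mean "no preference". */
struct demo_thread_state {
	signed char user_chiplet_id;  /* explicitly signed: -1 stays negative on every ABI */
};

/* Returns true when a specific chiplet was requested. */
bool demo_has_chiplet_preference(const struct demo_thread_state *ts)
{
	assert(ts != NULL);
	/* With a plain `char` that happens to be unsigned, this test would
	 * always be true, which is what the quoted warning points at. */
	return ts->user_chiplet_id >= 0;
}
```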
* treewide: use get_random_u32() when possible [Jason A. Donenfeld; 2022-10-11; 1 file; -1/+1]
  The prandom_u32() function has been a deprecated inline wrapper around get_random_u32() for several releases now, and compiles down to the exact same code. Replace the deprecated wrapper with a direct call to the real function. The same also applies to get_random_int(), which is just a wrapper around get_random_u32(). This was done as a basic find and replace.

  Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  Reviewed-by: Kees Cook <keescook@chromium.org>
  Reviewed-by: Yury Norov <yury.norov@gmail.com>
  Reviewed-by: Jan Kara <jack@suse.cz> # for ext4
  Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> # for sch_cake
  Acked-by: Chuck Lever <chuck.lever@oracle.com> # for nfsd
  Acked-by: Jakub Kicinski <kuba@kernel.org>
  Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> # for thunderbolt
  Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
  Acked-by: Helge Deller <deller@gmx.de> # for parisc
  Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* Merge tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm [Linus Torvalds; 2022-10-10; 1 file; -30/+15]
|\
| | Pull MM updates from Andrew Morton:
| |
| |  - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in linux-next for a couple of months without, to my knowledge, any negative reports (or any positive ones, come to that).
| |  - Also the Maple Tree from Liam Howlett. An overlapping range-based tree for vmas. It is apparently slightly more efficient in its own right, but is mainly targeted at enabling work to reduce mmap_lock contention. Liam has identified a number of other tree users in the kernel which could be beneficially converted to mapletrees. Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat at [1]. This has yet to be addressed due to Liam's unfortunately timed vacation. He is now back and we'll get this fixed up.
| |  - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses clang-generated instrumentation to detect used-uninitialized bugs down to the single bit level. KMSAN keeps finding bugs. New ones, as well as the legacy ones.
| |  - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of memory into THPs.
| |  - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to support file/shmem-backed pages.
| |  - userfaultfd updates from Axel Rasmussen
| |  - zsmalloc cleanups from Alexey Romanov
| |  - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and memory-failure
| |  - Huang Ying adds enhancements to NUMA balancing memory tiering mode's page promotion, with a new way of detecting hot pages.
| |  - memcg updates from Shakeel Butt: charging optimizations and reduced memory consumption.
| |  - memcg cleanups from Kairui Song.
| |  - memcg fixes and cleanups from Johannes Weiner.
| |  - Vishal Moola provides more folio conversions
| |  - Zhang Yi removed ll_rw_block() :(
| |  - migration enhancements from Peter Xu
| |  - migration error-path bugfixes from Huang Ying
| |  - Aneesh Kumar added ability for a device driver to alter the memory tiering promotion paths. For optimizations by PMEM drivers, DRM drivers, etc.
| |  - vma merging improvements from Jakub Matěn.
| |  - NUMA hinting cleanups from David Hildenbrand.
| |  - xu xin added additional userspace visibility into KSM merging activity.
| |  - THP & KSM code consolidation from Qi Zheng.
| |  - more folio work from Matthew Wilcox.
| |  - KASAN updates from Andrey Konovalov.
| |  - DAMON cleanups from Kaixu Xia.
| |  - DAMON work from SeongJae Park: fixes, cleanups.
| |  - hugetlb sysfs cleanups from Muchun Song.
| |  - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.
| |
| | Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1]
| |
| | * tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits)
| |     hugetlb: allocate vma lock for all sharable vmas
| |     hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer
| |     hugetlb: fix vma lock handling during split vma and range unmapping
| |     mglru: mm/vmscan.c: fix imprecise comments
| |     mm/mglru: don't sync disk for each aging cycle
| |     mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol
| |     mm: memcontrol: use do_memsw_account() in a few more places
| |     mm: memcontrol: deprecate swapaccounting=0 mode
| |     mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled
| |     mm/secretmem: remove reduntant return value
| |     mm/hugetlb: add available_huge_pages() func
| |     mm: remove unused inline functions from include/linux/mm_inline.h
| |     selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory
| |     selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd
| |     selftests/vm: add thp collapse shmem testing
| |     selftests/vm: add thp collapse file and tmpfs testing
| |     selftests/vm: modularize thp collapse memory operations
| |     selftests/vm: dedup THP helpers
| |     mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
| |     mm/madvise: add file and shmem support to MADV_COLLAPSE
| |     ...
| * cxl: remove vma linked list walk [Matthew Wilcox (Oracle); 2022-09-26; 1 file; -30/+15]
|   Use the VMA iterator instead. This requires a little restructuring of the surrounding code to hoist the mm to the caller. That turns cxl_prefault_one() into a trivial function, so call cxl_fault_segment() directly.
|
|   Link: https://lkml.kernel.org/r/20220906194824.2110408-38-Liam.Howlett@oracle.com
|   Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
|   Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
|   Acked-by: Vlastimil Babka <vbabka@suse.cz>
|   Tested-by: Yu Zhao <yuzhao@google.com>
|   Cc: Catalin Marinas <catalin.marinas@arm.com>
|   Cc: David Hildenbrand <david@redhat.com>
|   Cc: David Howells <dhowells@redhat.com>
|   Cc: Davidlohr Bueso <dave@stgolabs.net>
|   Cc: SeongJae Park <sj@kernel.org>
|   Cc: Sven Schnelle <svens@linux.ibm.com>
|   Cc: Will Deacon <will@kernel.org>
|   Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | Merge tag 'char-misc-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc [Linus Torvalds; 2022-10-08; 48 files; -801/+2335]
|\ \
| | | Pull char/misc and other driver updates from Greg KH:
| | |
| | | "Here is the large set of char/misc and other small driver subsystem changes for 6.1-rc1. Loads of different things in here:
| | |
| | |   - IIO driver updates, additions, and changes. Probably the largest part of the diffstat
| | |   - habanalabs driver update with support for new hardware and features, the second largest part of the diff.
| | |   - fpga subsystem driver updates and additions
| | |   - mhi subsystem updates
| | |   - Coresight driver updates
| | |   - gnss subsystem updates
| | |   - extcon driver updates
| | |   - icc subsystem updates
| | |   - fsi subsystem updates
| | |   - nvmem subsystem and driver updates
| | |   - misc driver updates
| | |   - speakup driver additions for new features
| | |   - lots of tiny driver updates and cleanups
| | |
| | |  All of these have been in the linux-next tree for a while with no reported issues"
| | |
| | | * tag 'char-misc-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (411 commits)
| | |     w1: Split memcpy() of struct cn_msg flexible array
| | |     spmi: pmic-arb: increase SPMI transaction timeout delay
| | |     spmi: pmic-arb: block access for invalid PMIC arbiter v5 SPMI writes
| | |     spmi: pmic-arb: correct duplicate APID to PPID mapping logic
| | |     spmi: pmic-arb: add support to dispatch interrupt based on IRQ status
| | |     spmi: pmic-arb: check apid against limits before calling irq handler
| | |     spmi: pmic-arb: do not ack and clear peripheral interrupts in cleanup_irq
| | |     spmi: pmic-arb: handle spurious interrupt
| | |     spmi: pmic-arb: add a print in cleanup_irq
| | |     drivers: spmi: Directly use ida_alloc()/free()
| | |     MAINTAINERS: add TI ECAP driver info
| | |     counter: ti-ecap-capture: capture driver support for ECAP
| | |     Documentation: ABI: sysfs-bus-counter: add frequency & num_overflows items
| | |     dt-bindings: counter: add ti,am62-ecap-capture.yaml
| | |     counter: Introduce the COUNTER_COMP_ARRAY component type
| | |     counter: Consolidate Counter extension sysfs attribute creation
| | |     counter: Introduce the Count capture component
| | |     counter: 104-quad-8: Add Signal polarity component
| | |     counter: Introduce the Signal polarity component
| | |     counter: interrupt-cnt: Implement watch_validate callback
| | |     ...
| * | mei: gsc: Remove redundant dev_err call [Shang XiaoJing; 2022-09-24; 1 file; -1/+0]
      devm_ioremap_resource() prints an error message itself. Remove the dev_err call to avoid a redundant error message.

      Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com>
      Link: https://lore.kernel.org/r/20220923100841.17719-1-shangxiaojing@huawei.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | mei: fix repeated words in comments [Jilin Yuan; 2022-09-24; 1 file; -1/+1]
      Delete the redundant word 'from'.

      Acked-by: Tomas Winkler <tomas.winkler@intel.com>
      Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
      Link: https://lore.kernel.org/r/20220918100431.28381-1-yuanjilin@cdjrlc.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | drivers/misc/sgi-xp: Remove orphan declarations from drivers/misc/sgi-xp/xp.h [Gaosheng Cui; 2022-09-24; 1 file; -4/+0]
      Remove the following orphan declarations from drivers/misc/sgi-xp/xp.h:
      1. xp_nofault_PIOR_target
      2. xp_error_PIOR
      3. xp_nofault_PIOR

      They have been removed since commit 9726bfcdb977 ("misc/sgi-xp: remove SGI SN2 support"), so remove them.

      Reviewed-by: Steve Wahl <steve.wahl@hpe.com>
      Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
      Link: https://lore.kernel.org/r/20220913110356.764711-1-cuigaosheng1@huawei.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | misc: microchip: pci1xxxx: Fix a memory leak in the error handling of gp_aux_bus_probe() [Christophe JAILLET; 2022-09-22; 1 file; -2/+1]
      'aux_bus' is freed in the remove function but not in the error handling path of the probe. Use devm_kzalloc() to simplify the remove function and fix the leak in the probe.

      Fixes: 393fc2f5948f ("misc: microchip: pci1xxxx: load auxiliary bus driver for the PIO function in the multi-function endpoint of pci1xxxx device.")
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Link: https://lore.kernel.org/r/17e19926669a1654e5f2495bf3b289581183d02e.1663482259.git.christophe.jaillet@wanadoo.fr
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | misc: microchip: pci1xxxx: Do not disable the pci device twice in gp_aux_bus_remove() [Christophe JAILLET; 2022-09-22; 1 file; -1/+0]
      gp_aux_bus_probe() uses pcim_enable_device(), so there is no point in calling pci_disable_device() explicitly in the remove function.

      Fixes: 393fc2f5948f ("misc: microchip: pci1xxxx: load auxiliary bus driver for the PIO function in the multi-function endpoint of pci1xxxx device.")
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Link: https://lore.kernel.org/r/8a3a385b3ae15ee7497469ec3250302b626a018b.1663482259.git.christophe.jaillet@wanadoo.fr
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | misc: microchip: pci1xxxx: use DEFINE_SIMPLE_DEV_PM_OPS() in place of the SIMPLE_DEV_PM_OPS() in pci1xxxx's gpio driver [Kumaravel Thiagarajan; 2022-09-22; 1 file; -1/+1]
      The build errors listed below, reported by Sudip Mukherjee <sudipm.mukherjee@gmail.com> for the riscv, s390, csky, alpha and loongarch allmodconfig builds, are fixed in this patch.

        drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_gpio.c:311:12: error: 'pci1xxxx_gpio_resume' defined but not used [-Werror=unused-function]
          311 | static int pci1xxxx_gpio_resume(struct device *dev)
              |            ^~~~~~~~~~~~~~~~~~~~
        drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_gpio.c:295:12: error: 'pci1xxxx_gpio_suspend' defined but not used [-Werror=unused-function]
          295 | static int pci1xxxx_gpio_suspend(struct device *dev)
              |            ^~~~~~~~~~~~~~~~~~~~~

      Fixes: 4ec7ac90ff39 ("misc: microchip: pci1xxxx: Add power management functions - suspend & resume handlers.")
      Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
      Signed-off-by: Kumaravel Thiagarajan <kumaravel.thiagarajan@microchip.com>
      Link: https://lore.kernel.org/r/20220915094729.646185-1-kumaravel.thiagarajan@microchip.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | misc: microchip: pci1xxxx: Remove duplicate include [Yihao Han; 2022-09-22; 1 file; -1/+0]
      Remove duplicate include in mchp_pci1xxxx_gpio.c

      Fixes: 7d3e4d807df2 ("misc: microchip: pci1xxxx: load gpio driver for the gpio controller auxiliary device enumerated by the auxiliary bus driver.")
      Reviewed-by: Kumaravel Thiagarajan <kumaravel.thiagarajan@microchip.com>
      Signed-off-by: Yihao Han <hanyihao@vivo.com>
      Link: https://lore.kernel.org/r/20220913030257.22352-1-hanyihao@vivo.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | habanalabs: eliminate aggregate use warning [Oded Gabbay; 2022-09-20; 5 files; -10/+8]
      When doing sizeof() and giving as argument a dereference of a pointer-to-a-pointer object, clang will issue a warning. Eliminate the warning by passing struct <name>*

      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
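The sizeof() change described above can be sketched in userspace C with calloc(). The type `struct demo_job` and the helper are hypothetical, and this does not reproduce the driver's actual call sites; it only shows the "name the pointer type" form the commit message refers to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_job { int id; };  /* hypothetical element type */

/* Allocate a zeroed table of `count` pointers to struct demo_job. */
struct demo_job **demo_alloc_job_table(size_t count)
{
	assert(count > 0);
	/* Naming the element type spells out that each slot holds a pointer,
	 * instead of relying on sizeof(*table) with a pointer-to-pointer. */
	return calloc(count, sizeof(struct demo_job *));
}
```

The caller owns the returned table and releases it with free().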
| * | habanalabs/gaudi: use 8KB aligned address for TPC kernels [Tomer Tayar; 2022-09-20; 1 file; -3/+4]
      I$ prefetch is enabled when sending a TPC kernel to initialize the TPC memory, and it has a restriction that the base address will be aligned to 8KB. Currently the base address is 128 bytes from the start address of the device SRAM, so prefetching will start 128 bytes before the actual kernel memory. Modify the kernel address to be 8KB aligned.

      Signed-off-by: Tomer Tayar <ttayar@habana.ai>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
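The alignment requirement above can be made concrete with a small helper. This is a sketch only: the macro and function names are invented here, and the commit message does not say whether the driver rounds up or simply picks a different fixed address.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_TPC_KERNEL_ALIGN 0x2000ull  /* 8KB, the figure from the commit message */

/* Round `addr` up to the next multiple of `align` (which must be a power of two). */
uint64_t demo_align_up(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}
```

For example, an offset of 128 bytes rounds up to 0x2000, while an offset that is already a multiple of 8KB is returned unchanged.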
| * | habanalabs: remove some f/w descriptor validations [farah kassabri; 2022-09-20; 1 file; -29/+14]
      To be forward-backward compatible with the firmware in the initial communication during preboot, we need to remove the validation of the header size. This will allow us to add more fields to the lkd_fw_comms_desc structure. Instead of validating the header size, we just print a warning when a mismatch in the descriptor is detected, and we calculate the CRC based on the descriptor size reported by the firmware instead of calculating it ourselves.

      Signed-off-by: farah kassabri <fkassabri@habana.ai>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * | habanalabs: build ASICs from new to old [Ohad Sharabi; 2022-09-20; 1 file; -4/+4]
      Newer ASIC code changes more often and is more likely to fail compilation. So, let's compile it first so that errors in those files fail the compilation sooner.

      Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * | habanalabs/gaudi2: allow user to flush PCIE by read [Ofir Bitton; 2022-09-19; 3 files; -1/+192]
      To flush PCIE, the user needs to read a register from the PCIE block. The chosen register is SPECIAL_GLBL_SPARE_0 and hence it needs to be unsecured.

      Signed-off-by: Ofir Bitton <obitton@habana.ai>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * | habanalabs: failure to open device due to reset is debug level [Oded Gabbay; 2022-09-19; 1 file; -2/+2]
      If the user wants to open the device, and the device is currently in reset, the user will get an error from the open(). We don't need to display an error in the dmesg for that as it is not a real error and we can spam the kernel log with this message.

      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * | habanalabs/gaudi2: Remove unnecessary (void*) conversions [Li zeming; 2022-09-19; 1 file; -3/+3]
      The void pointer object can be directly assigned to different structure objects; it does not need to be cast.

      Signed-off-by: Li zeming <zeming@nfschina.com>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
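The C rule behind this cleanup is that `void *` converts implicitly to any object pointer type. A minimal sketch, with a hypothetical `struct demo_ctx` standing in for the driver's private data:

```c
#include <assert.h>
#include <stddef.h>

struct demo_ctx { int engine_id; };  /* hypothetical private data */

/* Callback-style helper that receives opaque data as void *. */
int demo_engine_id(void *data)
{
	struct demo_ctx *ctx = data;  /* no (struct demo_ctx *) cast needed in C */

	assert(ctx != NULL);
	return ctx->engine_id;
}
```

Note that this holds for C only; C++ would reject the assignment without a cast.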
| * | habanalabs/gaudi2: add secured attestation info uapi [Dani Liberman; 2022-09-19; 4 files; -2/+176]
      The user will provide a nonce via the ioctl, and will retrieve secured attestation data of the boot, generated using the given nonce.

      Signed-off-by: Dani Liberman <dliberman@habana.ai>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * | habanalabs/gaudi2: add handling to pmmu events in eqe handler [Dani Liberman; 2022-09-19; 1 file; -0/+1]
      In order to get the error cause and the captured address in case of a page fault, add pmmu events to the eqe handler.

      Signed-off-by: Dani Liberman <dliberman@habana.ai>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * | habanalabs/gaudi: change TPC Assert to use TPC DEC instead of QMAN err [Tal Cohen; 2022-09-19; 1 file; -6/+6]
      This change is made because there is a problem with using the QMAN error for the TPC assert async event: a security limitation applies to generating the assert via the QMAN error.

      Signed-off-by: Tal Cohen <talcohen@habana.ai>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * | habanalabs: rename error info structure [Dani Liberman; 2022-09-19; 5 files; -42/+43]
      As preparation for adding more errors to it, change it to a more suitable name.

      Signed-off-by: Dani Liberman <dliberman@habana.ai>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * | habanalabs/gaudi2: get f/w reset status register dynamically [farah kassabri; 2022-09-19; 2 files; -2/+7]
      Get the firmware reset status address from the dynamic registers we read from the firmware instead of using a define.

      Signed-off-by: farah kassabri <fkassabri@habana.ai>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * | habanalabs/gaudi2: increase hard-reset sleep time to 2 sec [Tomer Tayar; 2022-09-19; 1 file; -1/+1]
      The access to the device registers is blocked during hard reset, until preboot runs and allows the access to specific registers, including the PSOC BTM_FSM register which is used to know when the reset is done. Between the reset request and until this register is polled there is a small delay of 500 msec which is not enough for F/W to process the reset and for preboot to run, so the register might be accessed while it is blocked. To avoid it, increase the delay to 2 sec.

      Signed-off-by: Tomer Tayar <ttayar@habana.ai>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * | habanalabs/gaudi2: print RAZWI info upon PCIe access error [Tomer Tayar; 2022-09-19; 2 files; -4/+61]
      Add the dump of the RAZWI information when a PCIe access is blocked by RR.

      Signed-off-by: Tomer Tayar <ttayar@habana.ai>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * | habanalabs: MMU invalidation h/w is per device [Oded Gabbay; 2022-09-19; 6 files; -38/+42]
      The code used the mmu mutex to protect access to the context's page tables and invalidation of the MMU cache. Because pgt are per context, the mmu mutex was a member of the context object. The problem is that the device has a single MMU invalidation h/w (per MMU). Therefore, the mmu mutex should not be a property of the context but a property of the device.

      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * | habanalabs: new notifier events for device state [Tal Cohen; 2022-09-19; 2 files; -6/+87]
      Add new notifier events that report several device states. The general H/W error event is raised when a general H/W error occurs on the device. The user engine error event is raised when a device engine reports an error.

      Signed-off-by: Tal Cohen <talcohen@habana.ai>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * | habanalabs/gaudi2: free event irq if init fails [Oded Gabbay; 2022-09-19; 1 file; -1/+5]
      In case initialization fails after event irq was requested, we need to release that irq.

      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * | habanalabs: fix resetting the DRAM BAR [Ohad Sharabi; 2022-09-19; 1 file; -19/+22]
      The current code does not take into account the new DRAM region base, so the calculated address is wrong and can lead to a crash.

      Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * | habanalabs: add support for new cpucp return codes [Ofir Bitton; 2022-09-19; 4 files; -4/+51]
      Firmware now responds with more detailed cpucp return codes. The driver can now distinguish between error and debug return codes.

      Signed-off-by: Ofir Bitton <obitton@habana.ai>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * | habanalabs/gaudi2: read F/W security indication after hard reset [Tomer Tayar; 2022-09-19; 2 files; -7/+7]
      F/W security status might change after every reset. Add the reading of the preboot status to the hard reset sequence, which among others reads this security indication. As this preboot status reading includes the waiting for the preboot to be ready, it can be removed from the CPU init which is done in a later stage.

      Signed-off-by: Tomer Tayar <ttayar@habana.ai>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * | habanalabs/gaudi: rename mme cfg error response print [Ofir Bitton; 2022-09-19; 1 file; -1/+1]
      The current description is misleading, so rename it to a more suitable error description.

      Signed-off-by: Ofir Bitton <obitton@habana.ai>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * | habanalabs: fix possible hole in device va [farah kassabri; 2022-09-19; 2 files; -86/+20]
      cb_map_mem() uses gen_pool_alloc() to get a virtual address for mapping a CB. The mapping is done in chunks of page size, so if the CB size is larger, it is possible that the allocated virtual addresses won't be consecutive. The user retrieves this device VA, which is the virtual address in the first va_block. If there is a "hole" in the virtual addresses, the user can configure a HW block with a bad device VA.

      Signed-off-by: farah kassabri <fkassabri@habana.ai>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * | habanalabs: send device activity in a proper context [Ofir Bitton; 2022-09-19; 2 files; -4/+4]
      'Device activity open packet' should be sent outside of mutex as there is no real necessity for a lock. In addition 'device activity close packet' should be sent upon an actual release of the device.

      Signed-off-by: Ofir Bitton <obitton@habana.ai>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * | habanalabs: send device active message to f/w [farah kassabri; 2022-09-19; 9 files; -0/+58]
      As part of the RAS that is done by the f/w, we should send a message to the f/w when a user either acquires or releases the device.

      Signed-off-by: farah kassabri <fkassabri@habana.ai>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * | habanalabs/gaudi2: dump detailed information upon RAZWI [Ofir Bitton; 2022-09-19; 1 file; -51/+155]
      In order to improve debuggability, we add all available information when a RAZWI event occurs.

      Signed-off-by: Ofir Bitton <obitton@habana.ai>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * | habanalabs/gaudi2: log critical events with no rate limit [farah kassabri; 2022-09-18; 1 file; -4/+11]
      When we have a storm of HBM ECC SERR errors, we can reach a situation where the driver starts the hard reset flow without logging the error that caused the hard reset, due to log rate limiting.

      Signed-off-by: farah kassabri <fkassabri@habana.ai>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * | habanalabs: ignore EEPROM errors during boot [Ofir Bitton; 2022-09-18; 2 files; -0/+14]
      EEPROM errors reported by firmware are basically warnings and should not fail the boot process.

      Signed-off-by: Ofir Bitton <obitton@habana.ai>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * | habanalabs: perform context switch flow only if needed [Ofir Bitton; 2022-09-18; 3 files; -4/+9]
      Except Goya, none of our ASICs require context switch flow, hence we enable this flow only where it is needed.

      Signed-off-by: Ofir Bitton <obitton@habana.ai>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * | habanalabs: set command buffer host VA dynamically [Dafna Hirschfeld; 2022-09-18; 5 files; -18/+18]
      Set the addresses for the userspace command buffer dynamically instead of hard-coding them. There is no reason for them to be hard-coded.

      Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * | habanalabs: trace DMA allocations [Ohad Sharabi; 2022-09-18; 2 files; -27/+62]
      This patch adds tracepoints in the code for DMA allocation. The main purpose is to be able to cross data with the map operations and determine whether a memory violation occurred, for example freeing a DMA allocation before unmapping it from device memory. To achieve this, the DMA alloc/free code flows were refactored so that a single DMA tracepoint will catch many flows. To get a better understanding of what happened in the DMA allocations, the real allocating function is added to the trace as well.

      Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * | habanalabs: trace MMU map/unmap page [Ohad Sharabi; 2022-09-18; 1 file; -0/+7]
      This patch utilizes the defined tracepoint to trace the MMU's page map/unmap operations.

      Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * | habanalabs: define trace events [Ohad Sharabi; 2022-09-18; 1 file; -0/+3]
      This patch adds trace events for the habanalabs driver to gain all the benefits such an infrastructure can supply. The following events were added:
      - MMU map/unmap: to be able to track the driver's memory allocations
      - DMA alloc/free: to track our DMA allocations

      The above trace points, used in conjunction, will help us map the device memory usage as well as track memory violations.

      Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
      Acked-by: Oded Gabbay <ogabbay@kernel.org>
      Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * | habanalabs/gaudi2: assigning PQFs for ARC f/w in PDMA [Rajarama Manjukody Bhat; 2022-09-18; 2 files; -5/+16]
      Assigning 3 PQFs in PDMA1 and 2 PQFs in PDMA0 for ARC firmware usage.

      Signed-off-by: Rajarama Manjukody Bhat <rmbhat@habana.ai>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * | habanalabs: fix calculation of DRAM base address in PCIe BAR [Tomer Tayar; 2022-09-18; 1 file; -1/+5]
      The calculation of the device DRAM base address, before setting the relevant PCIe BAR to point at it, assumes that this BAR is used to access only the DRAM, and thus that the covered DRAM size is a power of 2. In future ASICs this is not necessarily true, so update the calculation to also support a non-power-of-2 size.

      Signed-off-by: Tomer Tayar <ttayar@habana.ai>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
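The difference between the power-of-2 and the general case can be sketched as follows. This is an illustration of the arithmetic only, under the assumption that the base is the largest multiple of the BAR-covered size not exceeding the target address; the function names are hypothetical and the driver's actual code may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when `n` has exactly one bit set. */
bool demo_is_power_of_2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Largest multiple of `bar_size` that is less than or equal to `addr`. */
uint64_t demo_bar_base(uint64_t addr, uint64_t bar_size)
{
	assert(bar_size != 0);
	if (demo_is_power_of_2(bar_size))
		return addr & ~(bar_size - 1);  /* masking is valid only for powers of two */
	return (addr / bar_size) * bar_size;    /* general case: divide, then multiply back */
}
```

With a size of 0xC00, masking with `~(0xC00 - 1)` would clear the wrong bits, which is why the general branch falls back to division.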
| * | habanalabs: if map page fails don't try to unmap it [Dafna Hirschfeld; 2022-09-18; 1 file; -0/+2]
      The original code tried to unmap a page that was not mapped as part of the map page error path.

      Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
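The general idea of an error path that undoes only what succeeded can be sketched like this. Everything here (`struct demo_mmu`, the helpers, the `fail_at` knob) is hypothetical and exists only to make the example self-contained; it is not the driver's MMU code or the actual two-line patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_PAGES 8

/* Hypothetical page table: mapped[i] records whether page i is mapped. */
struct demo_mmu {
	bool mapped[DEMO_MAX_PAGES];
	size_t fail_at;  /* index at which mapping fails; >= DEMO_MAX_PAGES means never */
};

int demo_map_page(struct demo_mmu *mmu, size_t idx)
{
	if (idx == mmu->fail_at)
		return -1;
	mmu->mapped[idx] = true;
	return 0;
}

void demo_unmap_page(struct demo_mmu *mmu, size_t idx)
{
	assert(mmu->mapped[idx]);  /* unmapping a page that was never mapped is a bug */
	mmu->mapped[idx] = false;
}

/* Map pages [0, count); on failure, undo only the pages that were mapped. */
int demo_map_range(struct demo_mmu *mmu, size_t count)
{
	assert(count <= DEMO_MAX_PAGES);
	for (size_t i = 0; i < count; i++) {
		if (demo_map_page(mmu, i) != 0) {
			while (i-- > 0)  /* starts below the failing index, which was never mapped */
				demo_unmap_page(mmu, i);
			return -1;
		}
	}
	return 0;
}
```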
| * | habanalabs: select FW_LOADER in Kconfig [Oded Gabbay; 2022-09-18; 1 file; -0/+1]
      The driver is loading firmware to the device and we use the firmware loading functions from the FW_LOADER module.

      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| * | habanalabs: add cdev index data member [Omer Shpigelman; 2022-09-18; 2 files; -5/+9]
      Instead of recalculating the cdev index, store it in a dedicated data member. This data member is intended to be passed to other drivers using the auxiliary bus infra and hence this new data member is necessary in case that the calculation is changed in the future.

      Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Signed-off-by: Oded Gabbay <ogabbay@kernel.org>