summaryrefslogtreecommitdiffstats
path: root/include/asm-generic
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'locking-core-2023-05-05' of ↵Linus Torvalds2023-05-052-1/+12
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: - Introduce local{,64}_try_cmpxchg() - a slightly more optimal primitive, which will be used in perf events ring-buffer code - Simplify/modify rwsems on PREEMPT_RT, to address writer starvation - Misc cleanups/fixes * tag 'locking-core-2023-05-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/atomic: Correct (cmp)xchg() instrumentation locking/x86: Define arch_try_cmpxchg_local() locking/arch: Wire up local_try_cmpxchg() locking/generic: Wire up local{,64}_try_cmpxchg() locking/atomic: Add generic try_cmpxchg{,64}_local() support locking/rwbase: Mitigate indefinite writer starvation locking/arch: Rename all internal __xchg() names to __arch_xchg()
| * locking/generic: Wire up local{,64}_try_cmpxchg()Uros Bizjak2023-04-292-1/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement generic support for local{,64}_try_cmpxchg(). Redirect to the atomic_ family of functions when the target does not provide its own local.h definitions. For 64-bit targets, implement local64_try_cmpxchg and local64_cmpxchg using typed C wrappers that call local_ family of functions and provide additional checking of their input arguments. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230405141710.3551-3-ubizjak@gmail.com Cc: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge tag 'mm-stable-2023-04-27-15-30' of ↵Linus Torvalds2023-04-271-2/+2
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of switching from a user process to a kernel thread. - More folio conversions from Kefeng Wang, Zhang Peng and Pankaj Raghav. - zsmalloc performance improvements from Sergey Senozhatsky. - Yue Zhao has found and fixed some data race issues around the alteration of memcg userspace tunables. - VFS rationalizations from Christoph Hellwig: - removal of most of the callers of write_one_page() - make __filemap_get_folio()'s return value more useful - Luis Chamberlain has changed tmpfs so it no longer requires swap backing. Use `mount -o noswap'. - Qi Zheng has made the slab shrinkers operate locklessly, providing some scalability benefits. - Keith Busch has improved dmapool's performance, making part of its operations O(1) rather than O(n). - Peter Xu adds the UFFD_FEATURE_WP_UNPOPULATED feature to userfaultd, permitting userspace to wr-protect anon memory unpopulated ptes. - Kirill Shutemov has changed MAX_ORDER's meaning to be inclusive rather than exclusive, and has fixed a bunch of errors which were caused by its unintuitive meaning. - Axel Rasmussen give userfaultfd the UFFDIO_CONTINUE_MODE_WP feature, which causes minor faults to install a write-protected pte. - Vlastimil Babka has done some maintenance work on vma_merge(): cleanups to the kernel code and improvements to our userspace test harness. - Cleanups to do_fault_around() by Lorenzo Stoakes. - Mike Rapoport has moved a lot of initialization code out of various mm/ files and into mm/mm_init.c. - Lorenzo Stoakes removd vmf_insert_mixed_prot(), which was added for DRM, but DRM doesn't use it any more. - Lorenzo has also coverted read_kcore() and vread() to use iterators and has thereby removed the use of bounce buffers in some cases. - Lorenzo has also contributed further cleanups of vma_merge(). - Chaitanya Prakash provides some fixes to the mmap selftesting code. - Matthew Wilcox changes xfs and afs so they no longer take sleeping locks in ->map_page(), a step towards RCUification of pagefaults. - Suren Baghdasaryan has improved mmap_lock scalability by switching to per-VMA locking. - Frederic Weisbecker has reworked the percpu cache draining so that it no longer causes latency glitches on cpu isolated workloads. - Mike Rapoport cleans up and corrects the ARCH_FORCE_MAX_ORDER Kconfig logic. - Liu Shixin has changed zswap's initialization so we no longer waste a chunk of memory if zswap is not being used. - Yosry Ahmed has improved the performance of memcg statistics flushing. - David Stevens has fixed several issues involving khugepaged, userfaultfd and shmem. - Christoph Hellwig has provided some cleanup work to zram's IO-related code paths. - David Hildenbrand has fixed up some issues in the selftest code's testing of our pte state changing. - Pankaj Raghav has made page_endio() unneeded and has removed it. - Peter Xu contributed some rationalizations of the userfaultfd selftests. - Yosry Ahmed has fixed an issue around memcg's page recalim accounting. - Chaitanya Prakash has fixed some arm-related issues in the selftests/mm code. - Longlong Xia has improved the way in which KSM handles hwpoisoned pages. - Peter Xu fixes a few issues with uffd-wp at fork() time. - Stefan Roesch has changed KSM so that it may now be used on a per-process and per-cgroup basis. * tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits) mm,unmap: avoid flushing TLB in batch if PTE is inaccessible shmem: restrict noswap option to initial user namespace mm/khugepaged: fix conflicting mods to collapse_file() sparse: remove unnecessary 0 values from rc mm: move 'mmap_min_addr' logic from callers into vm_unmapped_area() hugetlb: pte_alloc_huge() to replace huge pte_alloc_map() maple_tree: fix allocation in mas_sparse_area() mm: do not increment pgfault stats when page fault handler retries zsmalloc: allow only one active pool compaction context selftests/mm: add new selftests for KSM mm: add new KSM process and sysfs knobs mm: add new api to enable ksm per process mm: shrinkers: fix debugfs file permissions mm: don't check VMA write permissions if the PTE/PMD indicates write permissions migrate_pages_batch: fix statistics for longterm pin retry userfaultfd: use helper function range_in_vma() lib/show_mem.c: use for_each_populated_zone() simplify code mm: correct arg in reclaim_pages()/reclaim_clean_pages_from_list() fs/buffer: convert create_page_buffers to folio_create_buffers fs/buffer: add folio_create_empty_buffers helper ...
| * | mm: prefer xxx_page() alloc/free functions for order-0 pagesLorenzo Stoakes2023-03-281-2/+2
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update instances of alloc_pages(..., 0), __get_free_pages(..., 0) and __free_pages(..., 0) to use alloc_page(), __get_free_page() and __free_page() respectively in core code. Link: https://lkml.kernel.org/r/50c48ca4789f1da2a65795f2346f5ae3eff7d665.1678710232.git.lstoakes@gmail.com Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Hellwig <hch@infradead.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | Merge tag 'hyperv-next-signed-20230424' of ↵Linus Torvalds2023-04-272-10/+40
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv updates from Wei Liu: - PCI passthrough for Hyper-V confidential VMs (Michael Kelley) - Hyper-V VTL mode support (Saurabh Sengar) - Move panic report initialization code earlier (Long Li) - Various improvements and bug fixes (Dexuan Cui and Michael Kelley) * tag 'hyperv-next-signed-20230424' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (22 commits) PCI: hv: Replace retarget_msi_interrupt_params with hyperv_pcpu_input_arg Drivers: hv: move panic report code from vmbus to hv early init code x86/hyperv: VTL support for Hyper-V Drivers: hv: Kconfig: Add HYPERV_VTL_MODE x86/hyperv: Make hv_get_nmi_reason public x86/hyperv: Add VTL specific structs and hypercalls x86/init: Make get/set_rtc_noop() public x86/hyperv: Exclude lazy TLB mode CPUs from enlightened TLB flushes x86/hyperv: Add callback filter to cpumask_to_vpset() Drivers: hv: vmbus: Remove the per-CPU post_msg_page clocksource: hyper-v: make sure Invariant-TSC is used if it is available PCI: hv: Enable PCI pass-thru devices in Confidential VMs Drivers: hv: Don't remap addresses that are above shared_gpa_boundary hv_netvsc: Remove second mapping of send and recv buffers Drivers: hv: vmbus: Remove second way of mapping ring buffers Drivers: hv: vmbus: Remove second mapping of VMBus monitor pages swiotlb: Remove bounce buffer remapping for Hyper-V Driver: VMBus: Add Devicetree support dt-bindings: bus: Add Hyper-V VMBus Drivers: hv: vmbus: Convert acpi_device to more generic platform_device ...
| * | x86/hyperv: Add VTL specific structs and hypercallsSaurabh Sengar2023-04-181-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add structs and hypercalls required to enable VTL support on x86. Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Stanislav Kinsburskii <stanislav.kinsburskii@gmail.com> Link: https://lore.kernel.org/r/1681192532-15460-3-git-send-email-ssengar@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
| * | x86/hyperv: Add callback filter to cpumask_to_vpset()Michael Kelley2023-04-171-8/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When copying CPUs from a Linux cpumask to a Hyper-V VPset, cpumask_to_vpset() currently has a "_noself" variant that doesn't copy the current CPU to the VPset. Generalize this variant by replacing it with a "_skip" variant having a callback function that is invoked for each CPU to decide if that CPU should be copied. Update the one caller of cpumask_to_vpset_noself() to use the new "_skip" variant instead. No functional change. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/1679922967-26582-2-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
| * | PCI: hv: Enable PCI pass-thru devices in Confidential VMsMichael Kelley2023-04-171-0/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For PCI pass-thru devices in a Confidential VM, Hyper-V requires that PCI config space be accessed via hypercalls. In normal VMs, config space accesses are trapped to the Hyper-V host and emulated. But in a confidential VM, the host can't access guest memory to decode the instruction for emulation, so an explicit hypercall must be used. Add functions to make the new MMIO read and MMIO write hypercalls. Update the PCI config space access functions to use the hypercalls when such use is indicated by Hyper-V flags. Also, set the flag to allow the Hyper-V PCI driver to be loaded and used in a Confidential VM (a.k.a., "Isolation VM"). The driver has previously been hardened against a malicious Hyper-V host[1]. [1] https://lore.kernel.org/all/20220511223207.3386-2-parri.andrea@gmail.com/ Co-developed-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Link: https://lore.kernel.org/r/1679838727-87310-13-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
| * | hv_netvsc: Remove second mapping of send and recv buffersMichael Kelley2023-04-171-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With changes to how Hyper-V guest VMs flip memory between private (encrypted) and shared (decrypted), creating a second kernel virtual mapping for shared memory is no longer necessary. Everything needed for the transition to shared is handled by set_memory_decrypted(). As such, remove the code to create and manage the second mapping for the pre-allocated send and recv buffers. This mapping is the last user of hv_map_memory()/hv_unmap_memory(), so delete these functions as well. Finally, hv_map_memory() is the last user of vmap_pfn() in Hyper-V guest code, so remove the Kconfig selection of VMAP_PFN. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Tianyu Lan <Tianyu.Lan@microsoft.com> Link: https://lore.kernel.org/r/1679838727-87310-11-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
| * | Merge remote-tracking branch 'tip/x86/sev' into hyperv-nextWei Liu2023-04-171-0/+2
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge the following 6 patches from tip/x86/sev, which are taken from Michael Kelley's series [0]. The rest of Michael's series depend on them. x86/hyperv: Change vTOM handling to use standard coco mechanisms init: Call mem_encrypt_init() after Hyper-V hypercall init is done x86/mm: Handle decryption/re-encryption of bss_decrypted consistently Drivers: hv: Explicitly request decrypted in vmap_pfn() calls x86/hyperv: Reorder code to facilitate future work x86/ioremap: Add hypervisor callback for private MMIO mapping in coco VM 0: https://lore.kernel.org/linux-hyperv/1679838727-87310-1-git-send-email-mikelley@microsoft.com/
* | \ \ Merge tag 'gpio-updates-for-v6.4' of ↵Linus Torvalds2023-04-251-147/+0
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio updates from Bartosz Golaszewski: "We have some new drivers, significant refactoring of existing intel platforms, lots of improvements all around, mass conversion to using immutable irqchips by drivers that had not been converted individually yet and some changes in the core library code. Summary: New drivers: - add a driver for the Loongson GPIO controller - add a driver for the fxl6408 I2C GPIO expander - add a GPIO module containing code common for Intel Elkhart Lake and Merrifield platforms - add a driver for the Intel Elkhart Lake platform reusing the code from the intel tangier library GPIOLIB core: - GPIO ACPI improvements - simplify gpiochip_add_data_with_keys() fwnode handling - cleanup header inclusions (remove unneeded ones, order the rest alphabetically) - remove duplicate code (reuse krealloc() instead of open-coding it, drop a duplicated check in gpiod_find_and_request()) - reshuffle the code to remove unnecessary forward declarations - coding style cleanups and improvements - add a helper for accessing device fwnodes - small updates in docs Driver improvements: - convert all remaining GPIO irqchip drivers to using immutable irqchips - drop unnecessary of_match_ptr() macro expansions - shrink the code in gpio-merrifield significantly by reusing the code from gpio-tangier + minor tweaks to the driver code - remove MODULE_LICENSE() from drivers that can only be built-in - add device-tree support to gpio-loongson1 - use new regmap features in gpio-104-dio-48e and gpio-pcie-idio-24 - minor tweaks and fixes to gpio-xra1403, gpio-sim, gpio-tegra194, gpio-omap, gpio-aspeed, gpio-raspberrypi-exp - shrink code in gpio-ich and gpio-pxa - Kconfig tweak for gpio-pmic-eic-sprd" * tag 'gpio-updates-for-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: (99 commits) gpio: gpiolib: Simplify gpiochip_add_data_with_key() fwnode gpiolib: Add gpiochip_set_data() helper gpiolib: Move gpiochip_get_data() higher in the code gpiolib: Check array_info for NULL only once in gpiod_get_array() gpiolib: Replace open coded krealloc() gpiolib: acpi: Add a ignore wakeup quirk for Clevo NL5xNU gpiolib: acpi: Move ACPI device NULL check to acpi_get_driver_gpio_data() gpiolib: acpi: use the fwnode in acpi_gpiochip_find() gpio: mm-lantiq: Fix typo in the newly added header filename sh: mach-x3proto: Add missing #include <linux/gpio/driver.h> powerpc/40x: Add missing select OF_GPIO_MM_GPIOCHIP gpio: xlp: Convert to immutable irq_chip gpio: xilinx: Convert to immutable irq_chip gpio: xgs-iproc: Convert to immutable irq_chip gpio: visconti: Convert to immutable irq_chip gpio: tqmx86: Convert to immutable irq_chip gpio: thunderx: Convert to immutable irq_chip gpio: stmpe: Convert to immutable irq_chip gpio: siox: Convert to immutable irq_chip gpio: rda: Convert to immutable irq_chip ...
| * | | | gpiolib: remove asm-generic/gpio.hArnd Bergmann2023-03-061-146/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The asm-generic/gpio.h file is now always included when using gpiolib, so just move its contents into linux/gpio.h with a few minor simplifications. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
| * | | | gpiolib: Make the legacy <linux/gpio.h> consumer-onlyLinus Walleij2023-03-061-1/+0
| | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The legacy <linux/gpio.h> header was an all-inclusive header used by drivers and consumers alike. After eliminating the last users of the driver defines, we can drop the inclusion of the <linux/gpio/driver.h> header. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
* | | | Merge tag 'x86_sev_for_v6.4_rc1' of ↵Linus Torvalds2023-04-251-0/+2
|\ \ \ \ | |_|/ / |/| | / | | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 SEV updates from Borislav Petkov: - Add the necessary glue so that the kernel can run as a confidential SEV-SNP vTOM guest on Hyper-V. A vTOM guest basically splits the address space in two parts: encrypted and unencrypted. The use case being running unmodified guests on the Hyper-V confidential computing hypervisor - Double-buffer messages between the guest and the hardware PSP device so that no partial buffers are copied back'n'forth and thus potential message integrity and leak attacks are possible - Name the return value the sev-guest driver returns when the hw PSP device hasn't been called, explicitly - Cleanups * tag 'x86_sev_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/hyperv: Change vTOM handling to use standard coco mechanisms init: Call mem_encrypt_init() after Hyper-V hypercall init is done x86/mm: Handle decryption/re-encryption of bss_decrypted consistently Drivers: hv: Explicitly request decrypted in vmap_pfn() calls x86/hyperv: Reorder code to facilitate future work x86/ioremap: Add hypervisor callback for private MMIO mapping in coco VM x86/sev: Change snp_guest_issue_request()'s fw_err argument virt/coco/sev-guest: Double-buffer messages crypto: ccp: Get rid of __sev_platform_init_locked()'s local function pointer crypto: ccp - Name -1 return value as SEV_RET_NO_FW_CALL
| * | x86/hyperv: Change vTOM handling to use standard coco mechanismsMichael Kelley2023-03-271-0/+2
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hyper-V guests on AMD SEV-SNP hardware have the option of using the "virtual Top Of Memory" (vTOM) feature specified by the SEV-SNP architecture. With vTOM, shared vs. private memory accesses are controlled by splitting the guest physical address space into two halves. vTOM is the dividing line where the uppermost bit of the physical address space is set; e.g., with 47 bits of guest physical address space, vTOM is 0x400000000000 (bit 46 is set). Guest physical memory is accessible at two parallel physical addresses -- one below vTOM and one above vTOM. Accesses below vTOM are private (encrypted) while accesses above vTOM are shared (decrypted). In this sense, vTOM is like the GPA.SHARED bit in Intel TDX. Support for Hyper-V guests using vTOM was added to the Linux kernel in two patch sets[1][2]. This support treats the vTOM bit as part of the physical address. For accessing shared (decrypted) memory, these patch sets create a second kernel virtual mapping that maps to physical addresses above vTOM. A better approach is to treat the vTOM bit as a protection flag, not as part of the physical address. This new approach is like the approach for the GPA.SHARED bit in Intel TDX. Rather than creating a second kernel virtual mapping, the existing mapping is updated using recently added coco mechanisms. When memory is changed between private and shared using set_memory_decrypted() and set_memory_encrypted(), the PTEs for the existing kernel mapping are changed to add or remove the vTOM bit in the guest physical address, just as with TDX. The hypercalls to change the memory status on the host side are made using the existing callback mechanism. Everything just works, with a minor tweak to map the IO-APIC to use private accesses. To accomplish the switch in approach, the following must be done: * Update Hyper-V initialization to set the cc_mask based on vTOM and do other coco initialization. * Update physical_mask so the vTOM bit is no longer treated as part of the physical address * Remove CC_VENDOR_HYPERV and merge the associated vTOM functionality under CC_VENDOR_AMD. Update cc_mkenc() and cc_mkdec() to set/clear the vTOM bit as a protection flag. * Code already exists to make hypercalls to inform Hyper-V about pages changing between shared and private. Update this code to run as a callback from __set_memory_enc_pgtable(). * Remove the Hyper-V special case from __set_memory_enc_dec() * Remove the Hyper-V specific call to swiotlb_update_mem_attributes() since mem_encrypt_init() will now do it. * Add a Hyper-V specific implementation of the is_private_mmio() callback that returns true for the IO-APIC and vTPM MMIO addresses [1] https://lore.kernel.org/all/20211025122116.264793-1-ltykernel@gmail.com/ [2] https://lore.kernel.org/all/20211213071407.314309-1-ltykernel@gmail.com/ [ bp: Touchups. ] Signed-off-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/1679838727-87310-7-git-send-email-mikelley@microsoft.com
* | asm-generic: avoid __generic_cmpxchg_local warningsArnd Bergmann2023-04-043-11/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Code that passes a 32-bit constant into cmpxchg() produces a harmless sparse warning because of the truncation in the branch that is not taken: fs/erofs/zdata.c: note: in included file (through /home/arnd/arm-soc/arch/arm/include/asm/cmpxchg.h, /home/arnd/arm-soc/arch/arm/include/asm/atomic.h, /home/arnd/arm-soc/include/linux/atomic.h, ...): include/asm-generic/cmpxchg-local.h:29:33: warning: cast truncates bits from constant value (5f0ecafe becomes fe) include/asm-generic/cmpxchg-local.h:33:34: warning: cast truncates bits from constant value (5f0ecafe becomes cafe) include/asm-generic/cmpxchg-local.h:29:33: warning: cast truncates bits from constant value (5f0ecafe becomes fe) include/asm-generic/cmpxchg-local.h:30:42: warning: cast truncates bits from constant value (5f0edead becomes ad) include/asm-generic/cmpxchg-local.h:33:34: warning: cast truncates bits from constant value (5f0ecafe becomes cafe) include/asm-generic/cmpxchg-local.h:34:44: warning: cast truncates bits from constant value (5f0edead becomes dead) This was reported as a regression to Matt's recent __generic_cmpxchg_local patch, though this patch only added more warnings on top of the ones that were already there. Rewording the truncation to use an explicit bitmask instead of a cast to a smaller type avoids the warning but otherwise leaves the code unchanged. I had another look at why the cast is even needed for atomic_cmpxchg(), and as Matt describes the problem here is that atomic_t contains a signed 'int', but cmpxchg() takes an 'unsigned long' argument, and converting between the two leads to a 64-bit sign-extension of negative 32-bit atomics. I checked the other implementations of arch_cmpxchg() and did not find any others that run into the same problem as __generic_cmpxchg_local(), but it's easy to be on the safe side here and always convert the signed int into an unsigned int when calling arch_cmpxchg(), as this will work even when any of the arch_cmpxchg() implementations run into the same problem. Fixes: 624654152284 ("locking/atomic: cmpxchg: Make __generic_cmpxchg_local compare against zero-extended 'old' value") Reviewed-by: Matt Evans <mev@rivosinc.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
* | asm-generic/io.h: suppress endianness warnings for relaxed accessorsVladimir Oltean2023-04-041-6/+6
| | | | | | | | | | | | | | | | | | Copy the forced type casts from the normal MMIO accessors to suppress the sparse warnings that point out __raw_readl() returns a native endian word (just like readl()). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
* | asm-generic/io.h: suppress endianness warnings for readq() and writeq()Vladimir Oltean2023-04-041-2/+2
|/ | | | | | | | | | | | | | Commit c1d55d50139b ("asm-generic/io.h: Fix sparse warnings on big-endian architectures") missed fixing the 64-bit accessors. Arnd explains in the attached link why the casts are necessary, even if __raw_readq() and __raw_writeq() do not take endian-specific types. Link: https://lore.kernel.org/lkml/9105d6fc-880b-4734-857d-e3d30b87ccf6@app.fastmail.com/ Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
* Merge tag 'driver-core-6.3-rc1' of ↵Linus Torvalds2023-02-241-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the large set of driver core changes for 6.3-rc1. There's a lot of changes this development cycle, most of the work falls into two different categories: - fw_devlink fixes and updates. This has gone through numerous review cycles and lots of review and testing by lots of different devices. Hopefully all should be good now, and Saravana will be keeping a watch for any potential regression on odd embedded systems. - driver core changes to work to make struct bus_type able to be moved into read-only memory (i.e. const) The recent work with Rust has pointed out a number of areas in the driver core where we are passing around and working with structures that really do not have to be dynamic at all, and they should be able to be read-only making things safer overall. This is the contuation of that work (started last release with kobject changes) in moving struct bus_type to be constant. We didn't quite make it for this release, but the remaining patches will be finished up for the release after this one, but the groundwork has been laid for this effort. Other than that we have in here: - debugfs memory leak fixes in some subsystems - error path cleanups and fixes for some never-able-to-be-hit codepaths. - cacheinfo rework and fixes - Other tiny fixes, full details are in the shortlog All of these have been in linux-next for a while with no reported problems" [ Geert Uytterhoeven points out that that last sentence isn't true, and that there's a pending report that has a fix that is queued up - Linus ] * tag 'driver-core-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (124 commits) debugfs: drop inline constant formatting for ERR_PTR(-ERROR) OPP: fix error checking in opp_migrate_dentry() debugfs: update comment of debugfs_rename() i3c: fix device.h kernel-doc warnings dma-mapping: no need to pass a bus_type into get_arch_dma_ops() driver core: class: move EXPORT_SYMBOL_GPL() lines to the correct place Revert "driver core: add error handling for devtmpfs_create_node()" Revert "devtmpfs: add debug info to handle()" Revert "devtmpfs: remove return value of devtmpfs_delete_node()" driver core: cpu: don't hand-override the uevent bus_type callback. devtmpfs: remove return value of devtmpfs_delete_node() devtmpfs: add debug info to handle() driver core: add error handling for devtmpfs_create_node() driver core: bus: update my copyright notice driver core: bus: add bus_get_dev_root() function driver core: bus: constify bus_unregister() driver core: bus: constify some internal functions driver core: bus: constify bus_get_kset() driver core: bus: constify bus_register/unregister_notifier() driver core: remove private pointer from struct bus_type ...
| * dma-mapping: no need to pass a bus_type into get_arch_dma_ops()Greg Kroah-Hartman2023-02-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The get_arch_dma_ops() arch-specific function never does anything with the struct bus_type that is passed into it, so remove it entirely as it is not needed. Cc: Richard Henderson <richard.henderson@linaro.org> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Helge Deller <deller@gmx.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: x86@kernel.org Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: linux-alpha@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-ia64@vger.kernel.org Cc: linux-mips@vger.kernel.org Cc: linux-parisc@vger.kernel.org Cc: sparclinux@vger.kernel.org Cc: iommu@lists.linux.dev Cc: linux-arch@vger.kernel.org Acked-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20230214140121.131859-1-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | Merge tag 'mm-nonmm-stable-2023-02-20-15-29' of ↵Linus Torvalds2023-02-231-3/+4
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: "There is no particular theme here - mainly quick hits all over the tree. Most notable is a set of zlib changes from Mikhail Zaslonko which enhances and fixes zlib's use of S390 hardware support: 'lib/zlib: Set of s390 DFLTCC related patches for kernel zlib'" * tag 'mm-nonmm-stable-2023-02-20-15-29' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (55 commits) Update CREDITS file entry for Jesper Juhl sparc: allow PM configs for sparc32 COMPILE_TEST hung_task: print message when hung_task_warnings gets down to zero. arch/Kconfig: fix indentation scripts/tags.sh: fix the Kconfig tags generation when using latest ctags nilfs2: prevent WARNING in nilfs_dat_commit_end() lib/zlib: remove redundation assignement of avail_in dfltcc_gdht() lib/Kconfig.debug: do not enable DEBUG_PREEMPT by default lib/zlib: DFLTCC always switch to software inflate for Z_PACKET_FLUSH option lib/zlib: DFLTCC support inflate with small window lib/zlib: Split deflate and inflate states for DFLTCC lib/zlib: DFLTCC not writing header bits when avail_out == 0 lib/zlib: fix DFLTCC ignoring flush modes when avail_in == 0 lib/zlib: fix DFLTCC not flushing EOBS when creating raw streams lib/zlib: implement switching between DFLTCC and software lib/zlib: adjust offset calculation for dfltcc_state nilfs2: replace WARN_ONs for invalid DAT metadata block requests scripts/spelling.txt: add "exsits" pattern and fix typo instances fs: gracefully handle ->get_block not mapping bh in __mpage_writepage cramfs: Kconfig: fix spelling & punctuation ...
| * | docs: fault-injection: add requirements of error injectable functionsMasami Hiramatsu (Google)2023-02-021-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a section about the requirements of the error injectable functions and the type of errors. Since this section must be read before using ALLOW_ERROR_INJECTION() macro, that section is referred from the comment of the macro too. Link: https://lkml.kernel.org/r/167081321427.387937.15475445689482551048.stgit@devnote3 Link: https://lore.kernel.org/all/20221211115218.2e6e289bb85f8cf53c11aa97@kernel.org/T/#u Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Borislav Petkov (AMD) <bp@alien8.de> Cc: Chris Mason <clm@meta.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Florent Revest <revest@chromium.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: KP Singh <kpsingh@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * | error-injection: remove EI_ETYPE_NONEMasami Hiramatsu (Google)2023-02-021-1/+0
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Patch series "error-injection: Clarify the requirements of error injectable functions". Patches for clarifying the requirement of error injectable functions and to remove the confusing EI_ETYPE_NONE. This patch (of 2): Since the EI_ETYPE_NONE is confusing type, replace it with appropriate errno. The EI_ETYPE_NONE has been introduced for a dummy (error) value, but it can mislead people that they can use ALLOW_ERROR_INJECTION(func, NONE). So remove it from the EI_ETYPE and use appropriate errno instead. [akpm@linux-foundation.org: include/linux/error-injection.h needs errno.h] Link: https://lkml.kernel.org/r/167081319306.387937.10079195394503045678.stgit@devnote3 Link: https://lkml.kernel.org/r/167081320421.387937.4259807348852421112.stgit@devnote3 Fixes: 663faf9f7bee ("error-injection: Add injectable error types") Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Borislav Petkov (AMD) <bp@alien8.de> Cc: Chris Mason <clm@meta.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Florent Revest <revest@chromium.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: KP Singh <kpsingh@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | Merge tag 'mm-stable-2023-02-20-13-37' of ↵Linus Torvalds2023-02-233-10/+20
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Daniel Verkamp has contributed a memfd series ("mm/memfd: add F_SEAL_EXEC") which permits the setting of the memfd execute bit at memfd creation time, with the option of sealing the state of the X bit. - Peter Xu adds a patch series ("mm/hugetlb: Make huge_pte_offset() thread-safe for pmd unshare") which addresses a rare race condition related to PMD unsharing. - Several folioification patch serieses from Matthew Wilcox, Vishal Moola, Sidhartha Kumar and Lorenzo Stoakes - Johannes Weiner has a series ("mm: push down lock_page_memcg()") which does perform some memcg maintenance and cleanup work. - SeongJae Park has added DAMOS filtering to DAMON, with the series "mm/damon/core: implement damos filter". These filters provide users with finer-grained control over DAMOS's actions. SeongJae has also done some DAMON cleanup work. - Kairui Song adds a series ("Clean up and fixes for swap"). - Vernon Yang contributed the series "Clean up and refinement for maple tree". - Yu Zhao has contributed the "mm: multi-gen LRU: memcg LRU" series. It adds to MGLRU an LRU of memcgs, to improve the scalability of global reclaim. - David Hildenbrand has added some userfaultfd cleanup work in the series "mm: uffd-wp + change_protection() cleanups". - Christoph Hellwig has removed the generic_writepages() library function in the series "remove generic_writepages". - Baolin Wang has performed some maintenance on the compaction code in his series "Some small improvements for compaction". - Sidhartha Kumar is doing some maintenance work on struct page in his series "Get rid of tail page fields". - David Hildenbrand contributed some cleanup, bugfixing and generalization of pte management and of pte debugging in his series "mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with swap PTEs". - Mel Gorman and Neil Brown have removed the __GFP_ATOMIC allocation flag in the series "Discard __GFP_ATOMIC". - Sergey Senozhatsky has improved zsmalloc's memory utilization with his series "zsmalloc: make zspage chain size configurable". - Joey Gouly has added prctl() support for prohibiting the creation of writeable+executable mappings. The previous BPF-based approach had shortcomings. See "mm: In-kernel support for memory-deny-write-execute (MDWE)". - Waiman Long did some kmemleak cleanup and bugfixing in the series "mm/kmemleak: Simplify kmemleak_cond_resched() & fix UAF". - T.J. Alumbaugh has contributed some MGLRU cleanup work in his series "mm: multi-gen LRU: improve". - Jiaqi Yan has provided some enhancements to our memory error statistics reporting, mainly by presenting the statistics on a per-node basis. See the series "Introduce per NUMA node memory error statistics". - Mel Gorman has a second and hopefully final shot at fixing a CPU-hog regression in compaction via his series "Fix excessive CPU usage during compaction". - Christoph Hellwig does some vmalloc maintenance work in the series "cleanup vfree and vunmap". - Christoph Hellwig has removed block_device_operations.rw_page() in ths series "remove ->rw_page". - We get some maple_tree improvements and cleanups in Liam Howlett's series "VMA tree type safety and remove __vma_adjust()". - Suren Baghdasaryan has done some work on the maintainability of our vm_flags handling in the series "introduce vm_flags modifier functions". - Some pagemap cleanup and generalization work in Mike Rapoport's series "mm, arch: add generic implementation of pfn_valid() for FLATMEM" and "fixups for generic implementation of pfn_valid()" - Baoquan He has done some work to make /proc/vmallocinfo and /proc/kcore better represent the real state of things in his series "mm/vmalloc.c: allow vread() to read out vm_map_ram areas". - Jason Gunthorpe rationalized the GUP system's interface to the rest of the kernel in the series "Simplify the external interface for GUP". - SeongJae Park wishes to migrate people from DAMON's debugfs interface over to its sysfs interface. To support this, we'll temporarily be printing warnings when people use the debugfs interface. See the series "mm/damon: deprecate DAMON debugfs interface". - Andrey Konovalov provided the accurately named "lib/stackdepot: fixes and clean-ups" series. - Huang Ying has provided a dramatic reduction in migration's TLB flush IPI rates with the series "migrate_pages(): batch TLB flushing". - Arnd Bergmann has some objtool fixups in "objtool warning fixes". * tag 'mm-stable-2023-02-20-13-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (505 commits) include/linux/migrate.h: remove unneeded externs mm/memory_hotplug: cleanup return value handing in do_migrate_range() mm/uffd: fix comment in handling pte markers mm: change to return bool for isolate_movable_page() mm: hugetlb: change to return bool for isolate_hugetlb() mm: change to return bool for isolate_lru_page() mm: change to return bool for folio_isolate_lru() objtool: add UACCESS exceptions for __tsan_volatile_read/write kmsan: disable ftrace in kmsan core code kasan: mark addr_has_metadata __always_inline mm: memcontrol: rename memcg_kmem_enabled() sh: initialize max_mapnr m68k/nommu: add missing definition of ARCH_PFN_OFFSET mm: percpu: fix incorrect size in pcpu_obj_full_size() maple_tree: reduce stack usage with gcc-9 and earlier mm: page_alloc: call panic() when memoryless node allocation fails mm: multi-gen LRU: avoid futile retries migrate_pages: move THP/hugetlb migration support check to simplify code migrate_pages: batch flushing TLB migrate_pages: share more code between _unmap and _move ...
| * | mm, arch: add generic implementation of pfn_valid() for FLATMEMMike Rapoport (IBM)2023-02-092-2/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Every architecture that supports FLATMEM memory model defines its own version of pfn_valid() that essentially compares a pfn to max_mapnr. Use mips/powerpc version implemented as static inline as a generic implementation of pfn_valid() and drop its per-architecture definitions. [rppt@kernel.org: fix the generic pfn_valid()] Link: https://lkml.kernel.org/r/Y9lg7R1Yd931C+y5@kernel.org Link: https://lkml.kernel.org/r/20230129124235.209895-5-rppt@kernel.org Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Guo Ren <guoren@kernel.org> [csky] Acked-by: Huacai Chen <chenhuacai@loongson.cn> [LoongArch] Acked-by: Stafford Horne <shorne@gmail.com> [OpenRISC] Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Reviewed-by: David Hildenbrand <david@redhat.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Cc: Brian Cain <bcain@quicinc.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Helge Deller <deller@gmx.de> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vineet Gupta <vgupta@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * | mm/uffd: always wr-protect pte in pte|pmd_mkuffd_wp()Peter Xu2023-01-181-8/+8
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is a cleanup to always wr-protect pte/pmd in mkuffd_wp paths. The reasons I still think this patch is worthwhile, are: (1) It is a cleanup already; diffstat tells. (2) It just feels natural after I thought about this, if the pte is uffd protected, let's remove the write bit no matter what it was. (2) Since x86 is the only arch that supports uffd-wp, it also redefines pte|pmd_mkuffd_wp() in that it should always contain removals of write bits. It means any future arch that want to implement uffd-wp should naturally follow this rule too. It's good to make it a default, even if with vm_page_prot changes on VM_UFFD_WP. (3) It covers more than vm_page_prot. So no chance of any potential future "accident" (like pte_mkdirty() sparc64 or loongarch, even though it just got its pte_mkdirty fixed <1 month ago). It'll be fairly clear when reading the code too that we don't worry anything before a pte_mkuffd_wp() on uncertainty of the write bit. We may call pte_wrprotect() one more time in some paths (e.g. thp split), but that should be fully local bitop instruction so the overhead should be negligible. Although this patch should logically also fix all the known issues on uffd-wp too recently on page migration (not for numa hint recovery - that may need another explcit pte_wrprotect), but this is not the plan for that fix. So no fixes, and stable doesn't need this. Link: https://lkml.kernel.org/r/20221214201533.1774616-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ives van Hoorne <ives@codesandbox.io> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Nadav Amit <nadav.amit@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | Merge tag 'gpio-updates-for-v6.3' of ↵Linus Torvalds2023-02-221-12/+0
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio updates from Bartosz Golaszewski: "A rather small update, there are no new drivers, just improvements and refactoring in existing ones. Thanks to migrating of several drivers to using generalized APIs and dropping of OF interfaces in favor of using software nodes we're actually removing more code than we're adding. Core GPIOLIB: - drop several OF interfaces after moving a significant part of the code to using software nodes - remove more interfaces referring to the global GPIO numberspace that we're getting rid of - improvements in the gpio-regmap library - add helper for GPIO device reference counting - remove unused APIs - minor tweaks like sorting headers alphabetically Extended support in existing drivers: - add support for Tegra 234 PMC to gpio-tegra186 Driver improvements: - migrate the 104-dio/idi family of drivers to using the regmap-irq API - migrate gpio-i8255 and gpio-mm to the GPIO regmap API - clean-ups in gpio-pca953x - remove duplicate assignments of of_gpio_n_cells in gpio-davinci, gpio-ge, gpio-xilinx, gpio-zevio and gpio-wcd934x - improvements to gpio-pcf857x: implement get/set_multiple callbacks, use generic device properties instead of OF + minor tweaks - fix OF-related header includes and Kconfig dependencies in gpio-zevio - dynamically allocate the GPIO base in gpio-omap - use a dedicated printf specifier for printing fwnode info in gpio-sim - use dev_name() for the GPIO chip label in gpio-vf610 - other minor tweaks and fixes Documentation: - remove mentions of legacy API from comments in various places - convert the DT binding documents to YAML schema for Fujitsu MB86S7x, Unisoc GPIO and Unisoc EIC - document the Unisoc UMS512 controller in DT bindings" * tag 'gpio-updates-for-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: (54 commits) gpio: sim: Use %pfwP specifier instead of calling fwnode API directly gpio: tegra186: remove unneeded loop in tegra186_gpio_init_route_mapping() gpiolib: of: Move enum of_gpio_flags to its only user gpio: mvebu: Use IS_REACHABLE instead of IS_ENABLED for CONFIG_PWM gpio: zevio: Add missing header gpio: Get rid of gpio_to_chip() gpio: pcf857x: Drop unneeded explicit casting gpio: pcf857x: Make use of device properties gpio: pcf857x: Get rid of legacy platform data gpio: rockchip: Do not mention legacy API in the code gpio: wcd934x: Remove duplicate assignment of of_gpio_n_cells gpio: zevio: Use proper headers and drop OF_GPIO dependency gpio: zevio: Remove duplicate assignment of of_gpio_n_cells gpio: xilinx: Remove duplicate assignment of of_gpio_n_cells dt-bindings: gpio: Add compatible string for Unisoc UMS512 dt-bindings: gpio: Convert Unisoc EIC controller binding to yaml dt-bindings: gpio: Convert Unisoc GPIO controller binding to yaml gpio: ge: Remove duplicate assignment of of_gpio_n_cells gpio: davinci: Remove duplicate assignment of of_gpio_n_cells gpio: omap: use dynamic allocation of base ...
| * | gpio: Get rid of gpio_to_chip()Linus Walleij2023-01-301-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The gpio_to_chip() function refers to the global GPIO numberspace which is a problem we want to get rid of. Get this function out of the header and open code it into gpiolib with appropriate FIXME notices so no new users appear in the kernel. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
| * | gpio: Remove unused and obsoleted gpio_export_link()Andy Shevchenko2023-01-301-6/+0
| |/ | | | | | | | | | | | | | | | | gpio_export_link() is legacy and unused API, remove it for good. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Yanteng Si <siyanteng@loongson.cn> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
* | Merge tag 'hyperv-next-signed-20230220' of ↵Linus Torvalds2023-02-212-0/+2
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv updates from Wei Liu: - allow Linux to run as the nested root partition for Microsoft Hypervisor (Jinank Jain and Nuno Das Neves) - clean up the return type of callback functions (Dawei Li) * tag 'hyperv-next-signed-20230220' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: x86/hyperv: Fix hv_get/set_register for nested bringup Drivers: hv: Make remove callback of hyperv driver void returned Drivers: hv: Enable vmbus driver for nested root partition x86/hyperv: Add an interface to do nested hypercalls Drivers: hv: Setup synic registers in case of nested root partition x86/hyperv: Add support for detecting nested hypervisor
| * | x86/hyperv: Add an interface to do nested hypercallsJinank Jain2023-01-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | According to TLFS, in order to communicate to L0 hypervisor there needs to be an additional bit set in the control register. This communication is required to perform privileged instructions which can only be performed by L0 hypervisor. An example of that could be setting up the VMBus infrastructure. Signed-off-by: Jinank Jain <jinankjain@linux.microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/24f9d46d5259a688113e6e5e69e21002647f4949.1672639707.git.jinankjain@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
| * | x86/hyperv: Add support for detecting nested hypervisorJinank Jain2023-01-121-0/+1
| |/ | | | | | | | | | | | | | | | | | | | | | | Detect if Linux is running as a nested hypervisor in the root partition for Microsoft Hypervisor, using flags provided by MSHV. Expose a new variable hv_nested that is used later for decisions specific to the nested use case. Signed-off-by: Jinank Jain <jinankjain@linux.microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/8e3e7112806e81d2292a66a56fe547162754ecea.1672639707.git.jinankjain@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
* | Merge tag 'sched-core-2023-02-20' of ↵Linus Torvalds2023-02-201-6/+3
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: - Improve the scalability of the CFS bandwidth unthrottling logic with large number of CPUs. - Fix & rework various cpuidle routines, simplify interaction with the generic scheduler code. Add __cpuidle methods as noinstr to objtool's noinstr detection and fix boatloads of cpuidle bugs & quirks. - Add new ABI: introduce MEMBARRIER_CMD_GET_REGISTRATIONS, to query previously issued registrations. - Limit scheduler slice duration to the sysctl_sched_latency period, to improve scheduling granularity with a large number of SCHED_IDLE tasks. - Debuggability enhancement on sys_exit(): warn about disabled IRQs, but also enable them to prevent a cascade of followup problems and repeat warnings. - Fix the rescheduling logic in prio_changed_dl(). - Micro-optimize cpufreq and sched-util methods. - Micro-optimize ttwu_runnable() - Micro-optimize the idle-scanning in update_numa_stats(), select_idle_capacity() and steal_cookie_task(). - Update the RSEQ code & self-tests - Constify various scheduler methods - Remove unused methods - Refine __init tags - Documentation updates - Misc other cleanups, fixes * tag 'sched-core-2023-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (110 commits) sched/rt: pick_next_rt_entity(): check list_entry sched/deadline: Add more reschedule cases to prio_changed_dl() sched/fair: sanitize vruntime of entity being placed sched/fair: Remove capacity inversion detection sched/fair: unlink misfit task from cpu overutilized objtool: mem*() are not uaccess safe cpuidle: Fix poll_idle() noinstr annotation sched/clock: Make local_clock() noinstr sched/clock/x86: Mark sched_clock() noinstr x86/pvclock: Improve atomic update of last_value in pvclock_clocksource_read() x86/atomics: Always inline arch_atomic64*() cpuidle: tracing, preempt: Squash _rcuidle tracing cpuidle: tracing: Warn about !rcu_is_watching() cpuidle: lib/bug: Disable rcu_is_watching() during WARN/BUG cpuidle: drivers: firmware: psci: Dont instrument suspend code KVM: selftests: Fix build of rseq test exit: Detect and fix irq disabled state in oops cpuidle, arm64: Fix the ARM64 cpuidle logic cpuidle: mvebu: Fix duplicate flags assignment sched/fair: Limit sched slice duration ...
| * | Merge tag 'v6.2-rc6' into sched/core, to pick up fixesIngo Molnar2023-01-311-0/+5
| |\| | | | | | | | | | | | | | | | Pick up fixes before merging another batch of cpuidle updates. Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | objtool/idle: Validate __cpuidle code as noinstrPeter Zijlstra2023-01-131-6/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Idle code is very like entry code in that RCU isn't available. As such, add a little validation. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Tony Lindgren <tony@atomide.com> Tested-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/20230112195540.373461409@infradead.org
* | | char/agp: introduce asm-generic/agp.hMike Rapoport2023-02-131-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are several architectures that duplicate definitions of map_page_into_agp(), unmap_page_from_agp() and flush_agp_cache(). Define those in asm-generic/agp.h and use it instead of duplicated per-architecture headers. Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
* | | locking/atomic: cmpxchg: Make __generic_cmpxchg_local compare against ↵Matt Evans2023-02-031-3/+3
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | zero-extended 'old' value __generic_cmpxchg_local takes unsigned long old/new arguments which might end up being up-cast from smaller signed types (which will sign-extend). The loaded compare value must be compared against a truncated smaller type, so down-cast appropriately for each size. The issue is apparent on 64-bit machines with code, such as atomic_dec_unless_positive(), that sign-extends from int. 64-bit machines generally don't use the generic cmpxchg but development/early ports might make use of it, so make it correct. Signed-off-by: Matt Evans <mev@rivosinc.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
* | arch: fix broken BuildID for arm64 and riscvMasahiro Yamada2022-12-301-0/+5
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | Dennis Gilmore reports that the BuildID is missing in the arm64 vmlinux since commit 994b7ac1697b ("arm64: remove special treatment for the link order of head.o"). The issue is that the type of .notes section, which contains the BuildID, changed from NOTES to PROGBITS. Ard Biesheuvel figured out that whichever object gets linked first gets to decide the type of a section. The PROGBITS type is the result of the compiler emitting .note.GNU-stack as PROGBITS rather than NOTE. While Ard provided a fix for arm64, I want to fix this globally because the same issue is happening on riscv since commit 2348e6bf4421 ("riscv: remove special treatment for the link order of head.o"). This problem will happen in general for other architectures if they start to drop unneeded entries from scripts/head-object-list.txt. Discard .note.GNU-stack in include/asm-generic/vmlinux.lds.h. Link: https://lore.kernel.org/lkml/CAABkxwuQoz1CTbyb57n0ZX65eSYiTonFCU8-LCQc=74D=xE=rA@mail.gmail.com/ Fixes: 994b7ac1697b ("arm64: remove special treatment for the link order of head.o") Fixes: 2348e6bf4421 ("riscv: remove special treatment for the link order of head.o") Reported-by: Dennis Gilmore <dennis@ausil.us> Suggested-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
* Merge tag 'asm-generic-6.2-1' of ↵Linus Torvalds2022-12-201-40/+40
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic updates from Arnd Bergmann: "There are only three fairly simple patches. The #include change to linux/swab.h addresses a userspace build issue, and the change to the mmio tracing logic helps provide more useful traces" * tag 'asm-generic-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: uapi: Add missing _UAPI prefix to <asm-generic/types.h> include guard asm-generic/io: Add _RET_IP_ to MMIO trace for more accurate debug info include/uapi/linux/swab: Fix potentially missing __always_inline
| * asm-generic/io: Add _RET_IP_ to MMIO trace for more accurate debug infoSai Prakash Ranjan2022-11-211-40/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Due to compiler optimizations like inlining, there are cases where MMIO traces using _THIS_IP_ for caller information might not be sufficient to provide accurate debug traces. 1) With optimizations (Seen with GCC): In this case, _THIS_IP_ works fine and prints the caller information since it will be inlined into the caller and we get the debug traces on who made the MMIO access, for ex: rwmmio_read: qcom_smmu_tlb_sync+0xe0/0x1b0 width=32 addr=0xffff8000087447f4 rwmmio_post_read: qcom_smmu_tlb_sync+0xe0/0x1b0 width=32 val=0x0 addr=0xffff8000087447f4 2) Without optimizations (Seen with Clang): _THIS_IP_ will not be sufficient in this case as it will print only the MMIO accessors itself which is of not much use since it is not inlined as below for example: rwmmio_read: readl+0x4/0x80 width=32 addr=0xffff8000087447f4 rwmmio_post_read: readl+0x48/0x80 width=32 val=0x4 addr=0xffff8000087447f4 So in order to handle this second case as well irrespective of the compiler optimizations, add _RET_IP_ to MMIO trace to make it provide more accurate debug information in all these scenarios. Before: rwmmio_read: readl+0x4/0x80 width=32 addr=0xffff8000087447f4 rwmmio_post_read: readl+0x48/0x80 width=32 val=0x4 addr=0xffff8000087447f4 After: rwmmio_read: qcom_smmu_tlb_sync+0xe0/0x1b0 -> readl+0x4/0x80 width=32 addr=0xffff8000087447f4 rwmmio_post_read: qcom_smmu_tlb_sync+0xe0/0x1b0 -> readl+0x4/0x80 width=32 val=0x0 addr=0xffff8000087447f4 Fixes: 210031971cdd ("asm-generic/io: Add logging support for MMIO accessors") Signed-off-by: Sai Prakash Ranjan <quic_saipraka@quicinc.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
* | Merge tag 'driver-core-6.2-rc1' of ↵Linus Torvalds2022-12-161-140/+94
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the set of driver core and kernfs changes for 6.2-rc1. The "big" change in here is the addition of a new macro, container_of_const() that will preserve the "const-ness" of a pointer passed into it. The "problem" of the current container_of() macro is that if you pass in a "const *", out of it can comes a non-const pointer unless you specifically ask for it. For many usages, we want to preserve the "const" attribute by using the same call. For a specific example, this series changes the kobj_to_dev() macro to use it, allowing it to be used no matter what the const value is. This prevents every subsystem from having to declare 2 different individual macros (i.e. kobj_const_to_dev() and kobj_to_dev()) and having the compiler enforce the const value at build time, which having 2 macros would not do either. The driver for all of this have been discussions with the Rust kernel developers as to how to properly mark driver core, and kobject, objects as being "non-mutable". The changes to the kobject and driver core in this pull request are the result of that, as there are lots of paths where kobjects and device pointers are not modified at all, so marking them as "const" allows the compiler to enforce this. So, a nice side affect of the Rust development effort has been already to clean up the driver core code to be more obvious about object rules. All of this has been bike-shedded in quite a lot of detail on lkml with different names and implementations resulting in the tiny version we have in here, much better than my original proposal. Lots of subsystem maintainers have acked the changes as well. Other than this change, included in here are smaller stuff like: - kernfs fixes and updates to handle lock contention better - vmlinux.lds.h fixes and updates - sysfs and debugfs documentation updates - device property updates All of these have been in the linux-next tree for quite a while with no problems" * tag 'driver-core-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (58 commits) device property: Fix documentation for fwnode_get_next_parent() firmware_loader: fix up to_fw_sysfs() to preserve const usb.h: take advantage of container_of_const() device.h: move kobj_to_dev() to use container_of_const() container_of: add container_of_const() that preserves const-ness of the pointer driver core: fix up missed drivers/s390/char/hmcdrv_dev.c class.devnode() conversion. driver core: fix up missed scsi/cxlflash class.devnode() conversion. driver core: fix up some missing class.devnode() conversions. driver core: make struct class.devnode() take a const * driver core: make struct class.dev_uevent() take a const * cacheinfo: Remove of_node_put() for fw_token device property: Add a blank line in Kconfig of tests device property: Rename goto label to be more precise device property: Move PROPERTY_ENTRY_BOOL() a bit down device property: Get rid of __PROPERTY_ENTRY_ARRAY_EL*SIZE*() kernfs: fix all kernel-doc warnings and multiple typos driver core: pass a const * into of_device_uevent() kobject: kset_uevent_ops: make name() callback take a const * kobject: kset_uevent_ops: make filter() callback take a const * kobject: make kobject_namespace take a const * ...
| * \ Merge 6.1-rc6 into driver-core-nextGreg Kroah-Hartman2022-11-213-8/+23
| |\ \ | | | | | | | | | | | | | | | | | | | | We need the kernfs changes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | | vmlinux.lds.h: add HEADERED_SECTION_* macrosJim Cromie2022-11-171-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These macros elaborate on BOUNDED_SECTION_(PRE|POST)_LABEL macros, prepending an optional KEEP(.gnu.linkonce##_sec_) reservation, and a linker-symbol to address it. This allows a developer to define a header struct (which must fit with the section's base struct-type), and could contain: 1- fields whose value is common to the entire set of data-records. This allows the header & data structs to specialize, complement each other, and shrink. 2- an uplink pointer to an organizing struct which refs other related/sub data-tables header record is addressable via the extern'd header linker-symbol Once the linker-symbols created by the macro are ref'd extern in code, that code can compute a record's index (ptr - start) in the "primary" table, then use it to index into the related/sub tables. Adding a primary.map_* field foreach sub-table would then allow deduplication and remapping of that sub-table. This is aimed at dyndbg's struct _ddebug __dyndbg[] section, whose 3 columns: function, file, module are 50%, 90%, 100% redundant. The module column is fully recoverable after dynamic_debug_init() saves it to each ddebug_table.module as the builtin __dyndbg[] table is parsed. Given that those 3 columns use 24/56 of a _ddebug record, a dyndbg=y kernel with ~5k callsites could reduce kernel memory substantially. Returning that memory to the kernel buddy-allocator? is then possible. Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Link: https://lore.kernel.org/r/20221117171633.923628-3-jim.cromie@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | | vmlinux.lds.h: fix BOUNDED_SECTION_(PRE|POST)_LABEL macrosJim Cromie2022-11-171-8/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 2f465b921bb8 ("vmlinux.lds.h: place optional header space in BOUNDED_SECTION") added BOUNDED_SECTION_(PRE|POST)_LABEL macros, encapsulating the basic boilerplate to KEEP/pack records into a section, and to mark the begin and end of the section with linker-symbols. But it tried to do extra, adding KEEP(*(.gnu.linkonce.##_sec_)) to optionally reserve a header record in front of the data. It wrongly placed the KEEP after the linker-symbol starting the section, so if a header was added, it would wind up in the data. Moving the KEEP to the "correct" place proved brittle, and too clever by half. The obvious safe fix is to remove the KEEP and restore the plain old boilerplate. The header can be added later, with separate macros. Also, the macro var-names: _s_, _e_ are nearly invisible, change them to more obvious names: _BEGIN_, _END_ Fixes: 2f465b921bb8 ("vmlinux.lds.h: place optional header space in BOUNDED_SECTION") Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Link: https://lore.kernel.org/r/20221117171633.923628-2-jim.cromie@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | | vmlinux.lds.h: place optional header space in BOUNDED_SECTIONJim Cromie2022-11-101-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Extend recently added BOUNDED_SECTION(_name) macro by adding a KEEP(*(.gnu.linkonce.##_name)) before the KEEP(*(_name)). This does nothing by itself, vmlinux is the same before and after this patch. But if a developer adds a .gnu.linkonce.foo record, that record is placed in the front of the section, where it can be used as a header for the table. The intent is to create an up-link to another organizing struct, from where related tables can be referenced. And since every item in a table has a known offset from its header, that same offset can be used to fetch records from the related tables. By itself, this doesnt gain much, unless maybe the pattern of access is to scan 1 or 2 fields in each fat record, but with 2 16 bit .map* fields added, we could de-duplicate 2 related tables. The use case here is struct _ddebug, which has 3 pointers (function, file, module) with substantial repetition; respectively 53%, 90%, and the module column is fully recoverable after dynamic_debug_init() splits the table into a linked list of "module" chunks. On a DYNAMIC_DEBUG=y kernel with 5k pr_debugs, the memory savings should be ~100 KiB. Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Link: https://lore.kernel.org/r/20221022225637.1406715-3-jim.cromie@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | | vmlinux.lds.h: add BOUNDED_SECTION* macrosJim Cromie2022-11-101-140/+79
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vmlinux.lds.h has ~45 occurrences of this general pattern: __start_foo = .; KEEP(*(foo)) __stop_foo = .; Reduce this pattern to a (group of 4) macros, and use them to reduce linecount. This was inspired by the codetag patchset. no functional change. CC: Suren Baghdasaryan <surenb@google.com> CC: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Link: https://lore.kernel.org/r/20221022225637.1406715-2-jim.cromie@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | | | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2022-12-152-5/+11
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull kvm updates from Paolo Bonzini: "ARM64: - Enable the per-vcpu dirty-ring tracking mechanism, together with an option to keep the good old dirty log around for pages that are dirtied by something other than a vcpu. - Switch to the relaxed parallel fault handling, using RCU to delay page table reclaim and giving better performance under load. - Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping option, which multi-process VMMs such as crosvm rely on (see merge commit 382b5b87a97d: "Fix a number of issues with MTE, such as races on the tags being initialised vs the PG_mte_tagged flag as well as the lack of support for VM_SHARED when KVM is involved. Patches from Catalin Marinas and Peter Collingbourne"). - Merge the pKVM shadow vcpu state tracking that allows the hypervisor to have its own view of a vcpu, keeping that state private. - Add support for the PMUv3p5 architecture revision, bringing support for 64bit counters on systems that support it, and fix the no-quite-compliant CHAIN-ed counter support for the machines that actually exist out there. - Fix a handful of minor issues around 52bit VA/PA support (64kB pages only) as a prefix of the oncoming support for 4kB and 16kB pages. - Pick a small set of documentation and spelling fixes, because no good merge window would be complete without those. s390: - Second batch of the lazy destroy patches - First batch of KVM changes for kernel virtual != physical address support - Removal of a unused function x86: - Allow compiling out SMM support - Cleanup and documentation of SMM state save area format - Preserve interrupt shadow in SMM state save area - Respond to generic signals during slow page faults - Fixes and optimizations for the non-executable huge page errata fix. - Reprogram all performance counters on PMU filter change - Cleanups to Hyper-V emulation and tests - Process Hyper-V TLB flushes from a nested guest (i.e. from a L2 guest running on top of a L1 Hyper-V hypervisor) - Advertise several new Intel features - x86 Xen-for-KVM: - Allow the Xen runstate information to cross a page boundary - Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured - Add support for 32-bit guests in SCHEDOP_poll - Notable x86 fixes and cleanups: - One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0). - Reinstate IBPB on emulated VM-Exit that was incorrectly dropped a few years back when eliminating unnecessary barriers when switching between vmcs01 and vmcs02. - Clean up vmread_error_trampoline() to make it more obvious that params must be passed on the stack, even for x86-64. - Let userspace set all supported bits in MSR_IA32_FEAT_CTL irrespective of the current guest CPUID. - Fudge around a race with TSC refinement that results in KVM incorrectly thinking a guest needs TSC scaling when running on a CPU with a constant TSC, but no hardware-enumerated TSC frequency. - Advertise (on AMD) that the SMM_CTL MSR is not supported - Remove unnecessary exports Generic: - Support for responding to signals during page faults; introduces new FOLL_INTERRUPTIBLE flag that was reviewed by mm folks Selftests: - Fix an inverted check in the access tracking perf test, and restore support for asserting that there aren't too many idle pages when running on bare metal. - Fix build errors that occur in certain setups (unsure exactly what is unique about the problematic setup) due to glibc overriding static_assert() to a variant that requires a custom message. - Introduce actual atomics for clear/set_bit() in selftests - Add support for pinning vCPUs in dirty_log_perf_test. - Rename the so called "perf_util" framework to "memstress". - Add a lightweight psuedo RNG for guest use, and use it to randomize the access pattern and write vs. read percentage in the memstress tests. - Add a common ucall implementation; code dedup and pre-work for running SEV (and beyond) guests in selftests. - Provide a common constructor and arch hook, which will eventually be used by x86 to automatically select the right hypercall (AMD vs. Intel). - A bunch of added/enabled/fixed selftests for ARM64, covering memslots, breakpoints, stage-2 faults and access tracking. - x86-specific selftest changes: - Clean up x86's page table management. - Clean up and enhance the "smaller maxphyaddr" test, and add a related test to cover generic emulation failure. - Clean up the nEPT support checks. - Add X86_PROPERTY_* framework to retrieve multi-bit CPUID values. - Fix an ordering issue in the AMX test introduced by recent conversions to use kvm_cpu_has(), and harden the code to guard against similar bugs in the future. Anything that tiggers caching of KVM's supported CPUID, kvm_cpu_has() in this case, effectively hides opt-in XSAVE features if the caching occurs before the test opts in via prctl(). Documentation: - Remove deleted ioctls from documentation - Clean up the docs for the x86 MSR filter. - Various fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (361 commits) KVM: x86: Add proper ReST tables for userspace MSR exits/flags KVM: selftests: Allocate ucall pool from MEM_REGION_DATA KVM: arm64: selftests: Align VA space allocator with TTBR0 KVM: arm64: Fix benign bug with incorrect use of VA_BITS KVM: arm64: PMU: Fix period computation for 64bit counters with 32bit overflow KVM: x86: Advertise that the SMM_CTL MSR is not supported KVM: x86: remove unnecessary exports KVM: selftests: Fix spelling mistake "probabalistic" -> "probabilistic" tools: KVM: selftests: Convert clear/set_bit() to actual atomics tools: Drop "atomic_" prefix from atomic test_and_set_bit() tools: Drop conflicting non-atomic test_and_{clear,set}_bit() helpers KVM: selftests: Use non-atomic clear/set bit helpers in KVM tests perf tools: Use dedicated non-atomic clear/set bit helpers tools: Take @bit as an "unsigned long" in {clear,set}_bit() helpers KVM: arm64: selftests: Enable single-step without a "full" ucall() KVM: x86: fix APICv/x2AVIC disabled when vm reboot by itself KVM: Remove stale comment about KVM_REQ_UNHALT KVM: Add missing arch for KVM_CREATE_DEVICE and KVM_{SET,GET}_DEVICE_ATTR KVM: Reference to kvm_userspace_memory_region in doc and comments KVM: Delete all references to removed KVM_SET_MEMORY_ALIAS ioctl ...
| * | | | x86/hyperv: Introduce HV_MAX_SPARSE_VCPU_BANKS/HV_VCPUS_PER_SPARSE_BANK ↵Vitaly Kuznetsov2022-11-182-5/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | constants It may not come clear from where the magical '64' value used in __cpumask_to_vpset() come from. Moreover, '64' means both the maximum sparse bank number as well as the number of vCPUs per bank. Add defines to make things clear. These defines are also going to be used by KVM. No functional change. Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221101145426.251680-15-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | | | Merge tag 'gpio-updates-for-v6.2' of ↵Linus Torvalds2022-12-151-34/+21
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio updates from Bartosz Golaszewski: "We have a new GPIO multiplexer driver, bunch of driver updates and refactoring in the core GPIO library. GPIO core: - teach gpiolib to work with software nodes for HW description - remove ARCH_NR_GPIOS treewide as we no longer impose any limit on the number of GPIOS since the allocation became entirely dynamic - add support for HW quirks for Cirrus CS42L56 codec, Marvell NFC controller, Freescale PCIe and Ethernet controller, Himax LCDs and Mediatek mt2701 - refactor OF quirk code - some general refactoring of the OF and ACPI code, adding new helpers, minor tweaks and fixes, making fwnode usage consistent etc. GPIO uAPI: - fix an issue where the user-space can trigger a NULL-pointer dereference in the kernel by opening a device file, forcing a driver unbind and then calling one of the syscalls on the associated file descriptor New drivers: - add gpio-latch: a new GPIO multiplexer based on latches connected to other GPIOs Driver updates: - convert i2c GPIO expanders to using .probe_new() - drop the gpio-sta2x11 driver - factor out common code for the ACCES IDIO-16 family of controllers and use this new library wherever applicable in drivers - add DT support to gpio-hisi - allow building gpio-davinci as a module and increase its maxItems property - add support for a new model to gpio-pca9570 - other minor changes to various drivers" * tag 'gpio-updates-for-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: (66 commits) gpio: sim: set a limit on the number of GPIOs gpiolib: protect the GPIO device against being dropped while in use by user-space gpiolib: cdev: fix NULL-pointer dereferences gpiolib: Provide to_gpio_device() helper gpiolib: Unify access to the device properties gpio: Do not include <linux/kernel.h> when not really needed. gpio: pcf857x: Convert to i2c's .probe_new() gpio: pca953x: Convert to i2c's .probe_new() gpio: max732x: Convert to i2c's .probe_new() dt-bindings: gpio: gpio-davinci: Increase maxItems in gpio-line-names gpiolib: ensure that fwnode is properly set gpio: sl28cpld: Replace irqchip mask_invert with unmask_base gpiolib: of: Use correct fwnode for DT-probed chips gpiolib: of: Drop redundant check in of_mm_gpiochip_remove() gpiolib: of: Prepare of_mm_gpiochip_add_data() for fwnode gpiolib: add support for software nodes gpiolib: consolidate GPIO lookups gpiolib: acpi: avoid leaking ACPI details into upper gpiolib layers gpiolib: acpi: teach acpi_find_gpio() to handle data-only nodes gpiolib: acpi: change acpi_find_gpio() to accept firmware node ...
| * | | | | gpiolib: Get rid of ARCH_NR_GPIOSChristophe Leroy2022-10-171-34/+21
| | |/ / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit 14e85c0e69d5 ("gpio: remove gpio_descs global array") there is no limitation on the number of GPIOs that can be allocated in the system since the allocation is fully dynamic. ARCH_NR_GPIOS is today only used in order to provide downwards gpiobase allocation from that value, while static allocation is performed upwards from 0. However that has the disadvantage of limiting the number of GPIOs that can be registered in the system. To overcome this limitation without requiring each and every platform to provide its 'best-guess' maximum number, rework the allocation to allocate upwards, allowing approx 2 millions of GPIOs. In order to still allow static allocation for legacy drivers, define GPIO_DYNAMIC_BASE with the value 512 as the start for dynamic allocation. The 512 value is chosen because it is the end of the current default range so all current static allocations are expected to be below that value. Of course that's just a rough estimate based on the default value, but assuming static allocations come first, even if there are more static allocations it should fit under the 512 value. In the future, it is expected that all static allocations go away and then dynamic allocation will be patched to start at 0. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>