summaryrefslogtreecommitdiffstats
path: root/arch/nios2
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'mm-stable-2023-04-27-15-30' of ↵Linus Torvalds2023-04-271-12/+10
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of switching from a user process to a kernel thread. - More folio conversions from Kefeng Wang, Zhang Peng and Pankaj Raghav. - zsmalloc performance improvements from Sergey Senozhatsky. - Yue Zhao has found and fixed some data race issues around the alteration of memcg userspace tunables. - VFS rationalizations from Christoph Hellwig: - removal of most of the callers of write_one_page() - make __filemap_get_folio()'s return value more useful - Luis Chamberlain has changed tmpfs so it no longer requires swap backing. Use `mount -o noswap'. - Qi Zheng has made the slab shrinkers operate locklessly, providing some scalability benefits. - Keith Busch has improved dmapool's performance, making part of its operations O(1) rather than O(n). - Peter Xu adds the UFFD_FEATURE_WP_UNPOPULATED feature to userfaultd, permitting userspace to wr-protect anon memory unpopulated ptes. - Kirill Shutemov has changed MAX_ORDER's meaning to be inclusive rather than exclusive, and has fixed a bunch of errors which were caused by its unintuitive meaning. - Axel Rasmussen give userfaultfd the UFFDIO_CONTINUE_MODE_WP feature, which causes minor faults to install a write-protected pte. - Vlastimil Babka has done some maintenance work on vma_merge(): cleanups to the kernel code and improvements to our userspace test harness. - Cleanups to do_fault_around() by Lorenzo Stoakes. - Mike Rapoport has moved a lot of initialization code out of various mm/ files and into mm/mm_init.c. - Lorenzo Stoakes removd vmf_insert_mixed_prot(), which was added for DRM, but DRM doesn't use it any more. - Lorenzo has also coverted read_kcore() and vread() to use iterators and has thereby removed the use of bounce buffers in some cases. - Lorenzo has also contributed further cleanups of vma_merge(). - Chaitanya Prakash provides some fixes to the mmap selftesting code. - Matthew Wilcox changes xfs and afs so they no longer take sleeping locks in ->map_page(), a step towards RCUification of pagefaults. - Suren Baghdasaryan has improved mmap_lock scalability by switching to per-VMA locking. - Frederic Weisbecker has reworked the percpu cache draining so that it no longer causes latency glitches on cpu isolated workloads. - Mike Rapoport cleans up and corrects the ARCH_FORCE_MAX_ORDER Kconfig logic. - Liu Shixin has changed zswap's initialization so we no longer waste a chunk of memory if zswap is not being used. - Yosry Ahmed has improved the performance of memcg statistics flushing. - David Stevens has fixed several issues involving khugepaged, userfaultfd and shmem. - Christoph Hellwig has provided some cleanup work to zram's IO-related code paths. - David Hildenbrand has fixed up some issues in the selftest code's testing of our pte state changing. - Pankaj Raghav has made page_endio() unneeded and has removed it. - Peter Xu contributed some rationalizations of the userfaultfd selftests. - Yosry Ahmed has fixed an issue around memcg's page recalim accounting. - Chaitanya Prakash has fixed some arm-related issues in the selftests/mm code. - Longlong Xia has improved the way in which KSM handles hwpoisoned pages. - Peter Xu fixes a few issues with uffd-wp at fork() time. - Stefan Roesch has changed KSM so that it may now be used on a per-process and per-cgroup basis. * tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits) mm,unmap: avoid flushing TLB in batch if PTE is inaccessible shmem: restrict noswap option to initial user namespace mm/khugepaged: fix conflicting mods to collapse_file() sparse: remove unnecessary 0 values from rc mm: move 'mmap_min_addr' logic from callers into vm_unmapped_area() hugetlb: pte_alloc_huge() to replace huge pte_alloc_map() maple_tree: fix allocation in mas_sparse_area() mm: do not increment pgfault stats when page fault handler retries zsmalloc: allow only one active pool compaction context selftests/mm: add new selftests for KSM mm: add new KSM process and sysfs knobs mm: add new api to enable ksm per process mm: shrinkers: fix debugfs file permissions mm: don't check VMA write permissions if the PTE/PMD indicates write permissions migrate_pages_batch: fix statistics for longterm pin retry userfaultfd: use helper function range_in_vma() lib/show_mem.c: use for_each_populated_zone() simplify code mm: correct arg in reclaim_pages()/reclaim_clean_pages_from_list() fs/buffer: convert create_page_buffers to folio_create_buffers fs/buffer: add folio_create_empty_buffers helper ...
| * nios2: drop ranges for definition of ARCH_FORCE_MAX_ORDERMike Rapoport (IBM)2023-04-181-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | nios2 defines range for ARCH_FORCE_MAX_ORDER allowing MAX_ORDER up to 19, which implies maximal contiguous allocation size of 2^19 pages or 2GiB. Drop bogus definition of ranges for ARCH_FORCE_MAX_ORDER and leave it a simple integer with sensible default. Users that *really* need to change the value of ARCH_FORCE_MAX_ORDER will be able to do so but they won't be mislead by the bogus ranges. Link: https://lkml.kernel.org/r/20230324052233.2654090-9-rppt@kernel.org Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: David Miller <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guo Ren <guoren@kernel.org> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Rich Felker <dalias@libc.org> Cc: "Russell King (Oracle)" <linux@armlinux.org.uk> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * nios2: reword ARCH_FORCE_MAX_ORDER prompt and help textMike Rapoport (IBM)2023-04-181-7/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The prompt and help text of ARCH_FORCE_MAX_ORDER are not even close to describe this configuration option. Update both to actually describe what this option does. Link: https://lkml.kernel.org/r/20230324052233.2654090-8-rppt@kernel.org Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: David Miller <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guo Ren <guoren@kernel.org> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Rich Felker <dalias@libc.org> Cc: "Russell King (Oracle)" <linux@armlinux.org.uk> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * mm, treewide: redefine MAX_ORDER sanelyKirill A. Shutemov2023-04-051-5/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | MAX_ORDER currently defined as number of orders page allocator supports: user can ask buddy allocator for page order between 0 and MAX_ORDER-1. This definition is counter-intuitive and lead to number of bugs all over the kernel. Change the definition of MAX_ORDER to be inclusive: the range of orders user can ask from buddy allocator is 0..MAX_ORDER now. [kirill@shutemov.name: fix min() warning] Link: https://lkml.kernel.org/r/20230315153800.32wib3n5rickolvh@box [akpm@linux-foundation.org: fix another min_t warning] [kirill@shutemov.name: fixups per Zi Yan] Link: https://lkml.kernel.org/r/20230316232144.b7ic4cif4kjiabws@box.shutemov.name [akpm@linux-foundation.org: fix underlining in docs] Link: https://lore.kernel.org/oe-kbuild-all/202303191025.VRCTk6mP-lkp@intel.com/ Link: https://lkml.kernel.org/r/20230315113133.11326-11-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | nios2: _TIF_ALLWORK_MASK is unusedAl Viro2023-03-051-3/+0
|/ | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds2023-03-051-1/+4
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull VM_FAULT_RETRY fixes from Al Viro: "Some of the page fault handlers do not deal with the following case correctly: - handle_mm_fault() has returned VM_FAULT_RETRY - there is a pending fatal signal - fault had happened in kernel mode Correct action in such case is not "return unconditionally" - fatal signals are handled only upon return to userland and something like copy_to_user() would end up retrying the faulting instruction and triggering the same fault again and again. What we need to do in such case is to make the caller to treat that as failed uaccess attempt - handle exception if there is an exception handler for faulting instruction or oops if there isn't one. Over the years some architectures had been fixed and now are handling that case properly; some still do not. This series should fix the remaining ones. Status: - m68k, riscv, hexagon, parisc: tested/acked by maintainers. - alpha, sparc32, sparc64: tested locally - bug has been reproduced on the unpatched kernel and verified to be fixed by this series. - ia64, microblaze, nios2, openrisc: build, but otherwise completely untested" * tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: openrisc: fix livelock in uaccess nios2: fix livelock in uaccess microblaze: fix livelock in uaccess ia64: fix livelock in uaccess sparc: fix livelock in uaccess alpha: fix livelock in uaccess parisc: fix livelock in uaccess hexagon: fix livelock in uaccess riscv: fix livelock in uaccess m68k: fix livelock in uaccess
| * nios2: fix livelock in uaccessAl Viro2023-03-021-1/+4
| | | | | | | | | | | | | | | | | | | | | | nios2 equivalent of 26178ec11ef3 "x86: mm: consolidate VM_FAULT_RETRY handling" If e.g. get_user() triggers a page fault and a fatal signal is caught, we might end up with handle_mm_fault() returning VM_FAULT_RETRY and not doing anything to page tables. In such case we must *not* return to the faulting insn - that would repeat the entire thing without making any progress; what we need instead is to treat that as failed (user) memory access. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | Merge tag 'mm-stable-2023-02-20-13-37' of ↵Linus Torvalds2023-02-233-17/+32
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Daniel Verkamp has contributed a memfd series ("mm/memfd: add F_SEAL_EXEC") which permits the setting of the memfd execute bit at memfd creation time, with the option of sealing the state of the X bit. - Peter Xu adds a patch series ("mm/hugetlb: Make huge_pte_offset() thread-safe for pmd unshare") which addresses a rare race condition related to PMD unsharing. - Several folioification patch serieses from Matthew Wilcox, Vishal Moola, Sidhartha Kumar and Lorenzo Stoakes - Johannes Weiner has a series ("mm: push down lock_page_memcg()") which does perform some memcg maintenance and cleanup work. - SeongJae Park has added DAMOS filtering to DAMON, with the series "mm/damon/core: implement damos filter". These filters provide users with finer-grained control over DAMOS's actions. SeongJae has also done some DAMON cleanup work. - Kairui Song adds a series ("Clean up and fixes for swap"). - Vernon Yang contributed the series "Clean up and refinement for maple tree". - Yu Zhao has contributed the "mm: multi-gen LRU: memcg LRU" series. It adds to MGLRU an LRU of memcgs, to improve the scalability of global reclaim. - David Hildenbrand has added some userfaultfd cleanup work in the series "mm: uffd-wp + change_protection() cleanups". - Christoph Hellwig has removed the generic_writepages() library function in the series "remove generic_writepages". - Baolin Wang has performed some maintenance on the compaction code in his series "Some small improvements for compaction". - Sidhartha Kumar is doing some maintenance work on struct page in his series "Get rid of tail page fields". - David Hildenbrand contributed some cleanup, bugfixing and generalization of pte management and of pte debugging in his series "mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with swap PTEs". - Mel Gorman and Neil Brown have removed the __GFP_ATOMIC allocation flag in the series "Discard __GFP_ATOMIC". - Sergey Senozhatsky has improved zsmalloc's memory utilization with his series "zsmalloc: make zspage chain size configurable". - Joey Gouly has added prctl() support for prohibiting the creation of writeable+executable mappings. The previous BPF-based approach had shortcomings. See "mm: In-kernel support for memory-deny-write-execute (MDWE)". - Waiman Long did some kmemleak cleanup and bugfixing in the series "mm/kmemleak: Simplify kmemleak_cond_resched() & fix UAF". - T.J. Alumbaugh has contributed some MGLRU cleanup work in his series "mm: multi-gen LRU: improve". - Jiaqi Yan has provided some enhancements to our memory error statistics reporting, mainly by presenting the statistics on a per-node basis. See the series "Introduce per NUMA node memory error statistics". - Mel Gorman has a second and hopefully final shot at fixing a CPU-hog regression in compaction via his series "Fix excessive CPU usage during compaction". - Christoph Hellwig does some vmalloc maintenance work in the series "cleanup vfree and vunmap". - Christoph Hellwig has removed block_device_operations.rw_page() in ths series "remove ->rw_page". - We get some maple_tree improvements and cleanups in Liam Howlett's series "VMA tree type safety and remove __vma_adjust()". - Suren Baghdasaryan has done some work on the maintainability of our vm_flags handling in the series "introduce vm_flags modifier functions". - Some pagemap cleanup and generalization work in Mike Rapoport's series "mm, arch: add generic implementation of pfn_valid() for FLATMEM" and "fixups for generic implementation of pfn_valid()" - Baoquan He has done some work to make /proc/vmallocinfo and /proc/kcore better represent the real state of things in his series "mm/vmalloc.c: allow vread() to read out vm_map_ram areas". - Jason Gunthorpe rationalized the GUP system's interface to the rest of the kernel in the series "Simplify the external interface for GUP". - SeongJae Park wishes to migrate people from DAMON's debugfs interface over to its sysfs interface. To support this, we'll temporarily be printing warnings when people use the debugfs interface. See the series "mm/damon: deprecate DAMON debugfs interface". - Andrey Konovalov provided the accurately named "lib/stackdepot: fixes and clean-ups" series. - Huang Ying has provided a dramatic reduction in migration's TLB flush IPI rates with the series "migrate_pages(): batch TLB flushing". - Arnd Bergmann has some objtool fixups in "objtool warning fixes". * tag 'mm-stable-2023-02-20-13-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (505 commits) include/linux/migrate.h: remove unneeded externs mm/memory_hotplug: cleanup return value handing in do_migrate_range() mm/uffd: fix comment in handling pte markers mm: change to return bool for isolate_movable_page() mm: hugetlb: change to return bool for isolate_hugetlb() mm: change to return bool for isolate_lru_page() mm: change to return bool for folio_isolate_lru() objtool: add UACCESS exceptions for __tsan_volatile_read/write kmsan: disable ftrace in kmsan core code kasan: mark addr_has_metadata __always_inline mm: memcontrol: rename memcg_kmem_enabled() sh: initialize max_mapnr m68k/nommu: add missing definition of ARCH_PFN_OFFSET mm: percpu: fix incorrect size in pcpu_obj_full_size() maple_tree: reduce stack usage with gcc-9 and earlier mm: page_alloc: call panic() when memoryless node allocation fails mm: multi-gen LRU: avoid futile retries migrate_pages: move THP/hugetlb migration support check to simplify code migrate_pages: batch flushing TLB migrate_pages: share more code between _unmap and _move ...
| * | mm, arch: add generic implementation of pfn_valid() for FLATMEMMike Rapoport (IBM)2023-02-091-9/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Every architecture that supports FLATMEM memory model defines its own version of pfn_valid() that essentially compares a pfn to max_mapnr. Use mips/powerpc version implemented as static inline as a generic implementation of pfn_valid() and drop its per-architecture definitions. [rppt@kernel.org: fix the generic pfn_valid()] Link: https://lkml.kernel.org/r/Y9lg7R1Yd931C+y5@kernel.org Link: https://lkml.kernel.org/r/20230129124235.209895-5-rppt@kernel.org Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Guo Ren <guoren@kernel.org> [csky] Acked-by: Huacai Chen <chenhuacai@loongson.cn> [LoongArch] Acked-by: Stafford Horne <shorne@gmail.com> [OpenRISC] Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Reviewed-by: David Hildenbrand <david@redhat.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Cc: Brian Cain <bcain@quicinc.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Helge Deller <deller@gmx.de> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vineet Gupta <vgupta@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * | mm: remove __HAVE_ARCH_PTE_SWP_EXCLUSIVEDavid Hildenbrand2023-02-021-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | __HAVE_ARCH_PTE_SWP_EXCLUSIVE is now supported by all architectures that support swp PTEs, so let's drop it. Link: https://lkml.kernel.org/r/20230113171026.582290-27-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * | nios2/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVEDavid Hildenbrand2023-02-022-1/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by using the yet-unused bit 31. Link: https://lkml.kernel.org/r/20230113171026.582290-15-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * | nios2/mm: refactor swap PTE layoutDavid Hildenbrand2023-02-021-8/+10
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | nios2 disables swap for a good reason: it doesn't even provide sufficient type bits as required by core MM. However, swap entries are nowadays also used for other purposes (migration entries, PTE markers, HWPoison, ...), and accidential use could be problematic. Let's properly use 5 bits for the swap type and document the layout. Bits 26--31 should get ignored by hardware completely, so they can be used. Link: https://lkml.kernel.org/r/20230113171026.582290-14-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Dinh Nguyen <dinguyen@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | arch/idle: Change arch_cpu_idle() behavior: always exit with IRQs disabledPeter Zijlstra2023-01-131-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current arch_cpu_idle() is called with IRQs disabled, but will return with IRQs enabled. However, the very first thing the generic code does after calling arch_cpu_idle() is raw_local_irq_disable(). This means that architectures that can idle with IRQs disabled end up doing a pointless 'enable-disable' dance. Therefore, push this IRQ disabling into the idle function, meaning that those architectures can avoid the pointless IRQ state flipping. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Tony Lindgren <tony@atomide.com> Tested-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Acked-by: Mark Rutland <mark.rutland@arm.com> [arm64] Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Guo Ren <guoren@kernel.org> Acked-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/20230112195540.618076436@infradead.org
* | objtool/idle: Validate __cpuidle code as noinstrPeter Zijlstra2023-01-131-1/+0
|/ | | | | | | | | | | | | | Idle code is very like entry code in that RCU isn't available. As such, add a little validation. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Tony Lindgren <tony@atomide.com> Tested-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/20230112195540.373461409@infradead.org
* Merge tag 'mm-stable-2022-12-13' of ↵Linus Torvalds2022-12-133-10/+0
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - More userfaultfs work from Peter Xu - Several convert-to-folios series from Sidhartha Kumar and Huang Ying - Some filemap cleanups from Vishal Moola - David Hildenbrand added the ability to selftest anon memory COW handling - Some cpuset simplifications from Liu Shixin - Addition of vmalloc tracing support by Uladzislau Rezki - Some pagecache folioifications and simplifications from Matthew Wilcox - A pagemap cleanup from Kefeng Wang: we have VM_ACCESS_FLAGS, so use it - Miguel Ojeda contributed some cleanups for our use of the __no_sanitize_thread__ gcc keyword. This series should have been in the non-MM tree, my bad - Naoya Horiguchi improved the interaction between memory poisoning and memory section removal for huge pages - DAMON cleanups and tuneups from SeongJae Park - Tony Luck fixed the handling of COW faults against poisoned pages - Peter Xu utilized the PTE marker code for handling swapin errors - Hugh Dickins reworked compound page mapcount handling, simplifying it and making it more efficient - Removal of the autonuma savedwrite infrastructure from Nadav Amit and David Hildenbrand - zram support for multiple compression streams from Sergey Senozhatsky - David Hildenbrand reworked the GUP code's R/O long-term pinning so that drivers no longer need to use the FOLL_FORCE workaround which didn't work very well anyway - Mel Gorman altered the page allocator so that local IRQs can remnain enabled during per-cpu page allocations - Vishal Moola removed the try_to_release_page() wrapper - Stefan Roesch added some per-BDI sysfs tunables which are used to prevent network block devices from dirtying excessive amounts of pagecache - David Hildenbrand did some cleanup and repair work on KSM COW breaking - Nhat Pham and Johannes Weiner have implemented writeback in zswap's zsmalloc backend - Brian Foster has fixed a longstanding corner-case oddity in file[map]_write_and_wait_range() - sparse-vmemmap changes for MIPS, LoongArch and NIOS2 from Feiyang Chen - Shiyang Ruan has done some work on fsdax, to make its reflink mode work better under xfstests. Better, but still not perfect - Christoph Hellwig has removed the .writepage() method from several filesystems. They only need .writepages() - Yosry Ahmed wrote a series which fixes the memcg reclaim target beancounting - David Hildenbrand has fixed some of our MM selftests for 32-bit machines - Many singleton patches, as usual * tag 'mm-stable-2022-12-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (313 commits) mm/hugetlb: set head flag before setting compound_order in __prep_compound_gigantic_folio mm: mmu_gather: allow more than one batch of delayed rmaps mm: fix typo in struct pglist_data code comment kmsan: fix memcpy tests mm: add cond_resched() in swapin_walk_pmd_entry() mm: do not show fs mm pc for VM_LOCKONFAULT pages selftests/vm: ksm_functional_tests: fixes for 32bit selftests/vm: cow: fix compile warning on 32bit selftests/vm: madv_populate: fix missing MADV_POPULATE_(READ|WRITE) definitions mm/gup_test: fix PIN_LONGTERM_TEST_READ with highmem mm,thp,rmap: fix races between updates of subpages_mapcount mm: memcg: fix swapcached stat accounting mm: add nodes= arg to memory.reclaim mm: disable top-tier fallback to reclaim on proactive reclaim selftests: cgroup: make sure reclaim target memcg is unprotected selftests: cgroup: refactor proactive reclaim code to reclaim_until() mm: memcg: fix stale protection of reclaim target memcg mm/mmap: properly unaccount memory on mas_preallocate() failure omfs: remove ->writepage jfs: remove ->writepage ...
| * MIPS&LoongArch&NIOS2: adjust prototypes of p?d_init()Feiyang Chen2022-12-111-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Patch series "mm/sparse-vmemmap: Generalise helpers and enable for LoongArch", v14. This series is in order to enable sparse-vmemmap for LoongArch. But LoongArch cannot use generic helpers directly because MIPS&LoongArch need to call pgd_init()/pud_init()/pmd_init() when populating page tables. So we adjust the prototypes of p?d_init() to make generic helpers can call them, then enable sparse-vmemmap with generic helpers, and to be further, generalise vmemmap_populate_hugepages() for ARM64, X86 and LoongArch. This patch (of 4): We are preparing to add sparse vmemmap support to LoongArch. MIPS and LoongArch need to call pgd_init()/pud_init()/pmd_init() when populating page tables, so adjust their prototypes to make generic helpers can call them. NIOS2 declares pmd_init() but doesn't use, just remove it to avoid build errors. Link: https://lkml.kernel.org/r/20221027125253.3458989-1-chenhuacai@loongson.cn Link: https://lkml.kernel.org/r/20221027125253.3458989-2-chenhuacai@loongson.cn Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Will Deacon <will@kernel.org> Cc: Xuefeng Li <lixuefeng@loongson.cn> Cc: Xuerui Wang <kernel@xen0n.name> Cc: Min Zhou <zhoumin@loongson.cn> Cc: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * nios2: remove unused INIT_MMAPKefeng Wang2022-11-081-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Patch series "mm: cleanup with VM_ACCESS_FLAGS". This patch (of 5): It seems that INIT_MMAP is gone in 2.4.10, not sure, anyways, it is useless now, kill it. Link: https://lkml.kernel.org/r/20221019034945.93081-1-wangkefeng.wang@huawei.com Link: https://lkml.kernel.org/r/20221019034945.93081-2-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Jarkko Sakkinen <jarkko@kernel.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: David Airlie <airlied@gmail.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * mm: remove kern_addr_valid() completelyKefeng Wang2022-11-081-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Most architectures (except arm64/x86/sparc) simply return 1 for kern_addr_valid(), which is only used in read_kcore(), and it calls copy_from_kernel_nofault() which could check whether the address is a valid kernel address. So as there is no need for kern_addr_valid(), let's remove it. Link: https://lkml.kernel.org/r/20221018074014.185687-1-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Acked-by: Heiko Carstens <hca@linux.ibm.com> [s390] Acked-by: Christoph Hellwig <hch@lst.de> Acked-by: Helge Deller <deller@gmx.de> [parisc] Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Acked-by: Guo Ren <guoren@kernel.org> [csky] Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: <aou@eecs.berkeley.edu> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Chris Zankel <chris@zankel.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Jonas Bonn <jonas@southpole.se> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@rivosinc.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Richard Henderson <richard.henderson@linaro.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Xuerui Wang <kernel@xen0n.name> Cc: Yoshinori Sato <ysato@users.osdn.me> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | Merge tag 'mm-nonmm-stable-2022-12-12' of ↵Linus Torvalds2022-12-121-3/+3
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - A ptrace API cleanup series from Sergey Shtylyov - Fixes and cleanups for kexec from ye xingchen - nilfs2 updates from Ryusuke Konishi - squashfs feature work from Xiaoming Ni: permit configuration of the filesystem's compression concurrency from the mount command line - A series from Akinobu Mita which addresses bound checking errors when writing to debugfs files - A series from Yang Yingliang to address rapidio memory leaks - A series from Zheng Yejian to address possible overflow errors in encode_comp_t() - And a whole shower of singleton patches all over the place * tag 'mm-nonmm-stable-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (79 commits) ipc: fix memory leak in init_mqueue_fs() hfsplus: fix bug causing custom uid and gid being unable to be assigned with mount rapidio: devices: fix missing put_device in mport_cdev_open kcov: fix spelling typos in comments hfs: Fix OOB Write in hfs_asc2mac hfs: fix OOB Read in __hfs_brec_find relay: fix type mismatch when allocating memory in relay_create_buf() ocfs2: always read both high and low parts of dinode link count io-mapping: move some code within the include guarded section kernel: kcsan: kcsan_test: build without structleak plugin mailmap: update email for Iskren Chernev eventfd: change int to __u64 in eventfd_signal() ifndef CONFIG_EVENTFD rapidio: fix possible UAF when kfifo_alloc() fails relay: use strscpy() is more robust and safer cpumask: limit visibility of FORCE_NR_CPUS acct: fix potential integer overflow in encode_comp_t() acct: fix accuracy loss for input value of encode_comp_t() linux/init.h: include <linux/build_bug.h> and <linux/stringify.h> rapidio: rio: fix possible name leak in rio_register_mport() rapidio: fix possible name leaks when rio_add_device() fails ...
| * | nios2: ptrace: user_regset_copyin_ignore() always returns 0Sergey Shtylyov2022-11-151-3/+3
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | user_regset_copyin_ignore() always returns 0, so checking its result seems pointless -- don't do this anymore... Link: https://lkml.kernel.org/r/20221014212235.10770-8-s.shtylyov@omp.ru Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Cc: Brian Cain <bcain@quicinc.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: David S. Miller <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Helge Deller <deller@gmx.de> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.osdn.me> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* / nios2: add FORCE for vmlinuz.gzRandy Dunlap2022-11-271-1/+1
|/ | | | | | | | | | Add FORCE to placate a warning from make: arch/nios2/boot/Makefile:24: FORCE prerequisite is missing Fixes: 2fc8483fdcde ("nios2: Build infrastructure") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
* Merge tag 'mm-nonmm-stable-2022-10-11' of ↵Linus Torvalds2022-10-123-7/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - hfs and hfsplus kmap API modernization (Fabio Francesco) - make crash-kexec work properly when invoked from an NMI-time panic (Valentin Schneider) - ntfs bugfixes (Hawkins Jiawei) - improve IPC msg scalability by replacing atomic_t's with percpu counters (Jiebin Sun) - nilfs2 cleanups (Minghao Chi) - lots of other single patches all over the tree! * tag 'mm-nonmm-stable-2022-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (71 commits) include/linux/entry-common.h: remove has_signal comment of arch_do_signal_or_restart() prototype proc: test how it holds up with mapping'less process mailmap: update Frank Rowand email address ia64: mca: use strscpy() is more robust and safer init/Kconfig: fix unmet direct dependencies ia64: update config files nilfs2: replace WARN_ONs by nilfs_error for checkpoint acquisition failure fork: remove duplicate included header files init/main.c: remove unnecessary (void*) conversions proc: mark more files as permanent nilfs2: remove the unneeded result variable nilfs2: delete unnecessary checks before brelse() checkpatch: warn for non-standard fixes tag style usr/gen_init_cpio.c: remove unnecessary -1 values from int file ipc/msg: mitigate the lock contention with percpu counter percpu: add percpu_counter_add_local and percpu_counter_sub_local fs/ocfs2: fix repeated words in comments relay: use kvcalloc to alloc page array in relay_alloc_page_array proc: make config PROC_CHILDREN depend on PROC_FS fs: uninline inode_maybe_inc_iversion() ...
| * kernel: exit: cleanup release_thread()Kefeng Wang2022-09-111-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Only x86 has own release_thread(), introduce a new weak release_thread() function to clean empty definitions in other ARCHs. Link: https://lkml.kernel.org/r/20220819014406.32266-1-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: Guo Ren <guoren@kernel.org> [csky] Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Brian Cain <bcain@quicinc.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Acked-by: Stafford Horne <shorne@gmail.com> [openrisc] Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Acked-by: Huacai Chen <chenhuacai@kernel.org> [LoongArch] Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Chris Zankel <chris@zankel.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Guo Ren <guoren@kernel.org> [csky] Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Jonas Bonn <jonas@southpole.se> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Henderson <richard.henderson@linaro.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Xuerui Wang <kernel@xen0n.name> Cc: Yoshinori Sato <ysato@users.osdn.me> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * treewide: defconfig: address renamed CONFIG_DEBUG_INFO=yArnd Bergmann2022-09-112-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CONFIG_DEBUG_INFO is now implicitly selected if one picks one of the explicit options that could be DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT, DEBUG_INFO_DWARF4, DEBUG_INFO_DWARF5. This was actually not what I had in mind when I suggested making it a 'choice' statement, but it's too late to change again now, and the Kconfig logic is more sensible in the new form. Change any defconfig file that had CONFIG_DEBUG_INFO enabled but did not pick DWARF4 or DWARF5 explicitly to now pick the toolchain default. Link: https://lkml.kernel.org/r/20220811114609.2097335-1-arnd@kernel.org Fixes: f9b3cd245784 ("Kconfig.debug: make DEBUG_INFO selectable from a choice") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Michal Simek <monstr@monstr.eu> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Yoshinori Sato <ysato@users.osdn.me> Cc: Rich Felker <dalias@libc.org> Cc: Richard Weinberger <richard@nod.at> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Chris Zankel <chris@zankel.net> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | Merge tag 'mm-stable-2022-10-08' of ↵Linus Torvalds2022-10-101-1/+1
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in linux-next for a couple of months without, to my knowledge, any negative reports (or any positive ones, come to that). - Also the Maple Tree from Liam Howlett. An overlapping range-based tree for vmas. It it apparently slightly more efficient in its own right, but is mainly targeted at enabling work to reduce mmap_lock contention. Liam has identified a number of other tree users in the kernel which could be beneficially onverted to mapletrees. Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat at [1]. This has yet to be addressed due to Liam's unfortunately timed vacation. He is now back and we'll get this fixed up. - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses clang-generated instrumentation to detect used-unintialized bugs down to the single bit level. KMSAN keeps finding bugs. New ones, as well as the legacy ones. - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of memory into THPs. - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to support file/shmem-backed pages. - userfaultfd updates from Axel Rasmussen - zsmalloc cleanups from Alexey Romanov - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and memory-failure - Huang Ying adds enhancements to NUMA balancing memory tiering mode's page promotion, with a new way of detecting hot pages. - memcg updates from Shakeel Butt: charging optimizations and reduced memory consumption. - memcg cleanups from Kairui Song. - memcg fixes and cleanups from Johannes Weiner. - Vishal Moola provides more folio conversions - Zhang Yi removed ll_rw_block() :( - migration enhancements from Peter Xu - migration error-path bugfixes from Huang Ying - Aneesh Kumar added ability for a device driver to alter the memory tiering promotion paths. For optimizations by PMEM drivers, DRM drivers, etc. - vma merging improvements from Jakub Matěn. - NUMA hinting cleanups from David Hildenbrand. - xu xin added aditional userspace visibility into KSM merging activity. - THP & KSM code consolidation from Qi Zheng. - more folio work from Matthew Wilcox. - KASAN updates from Andrey Konovalov. - DAMON cleanups from Kaixu Xia. - DAMON work from SeongJae Park: fixes, cleanups. - hugetlb sysfs cleanups from Muchun Song. - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core. Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1] * tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits) hugetlb: allocate vma lock for all sharable vmas hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer hugetlb: fix vma lock handling during split vma and range unmapping mglru: mm/vmscan.c: fix imprecise comments mm/mglru: don't sync disk for each aging cycle mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol mm: memcontrol: use do_memsw_account() in a few more places mm: memcontrol: deprecate swapaccounting=0 mode mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled mm/secretmem: remove reduntant return value mm/hugetlb: add available_huge_pages() func mm: remove unused inline functions from include/linux/mm_inline.h selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd selftests/vm: add thp collapse shmem testing selftests/vm: add thp collapse file and tmpfs testing selftests/vm: modularize thp collapse memory operations selftests/vm: dedup THP helpers mm/khugepaged: add tracepoint to hpage_collapse_scan_file() mm/madvise: add file and shmem support to MADV_COLLAPSE ...
| * | arch: mm: rename FORCE_MAX_ZONEORDER to ARCH_FORCE_MAX_ORDERZi Yan2022-09-111-1/+1
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This Kconfig option is used by individual arch to set its desired MAX_ORDER. Rename it to reflect its actual use. Link: https://lkml.kernel.org/r/20220815143959.1511278-1-zi.yan@sent.com Acked-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Zi Yan <ziy@nvidia.com> Acked-by: Guo Ren <guoren@kernel.org> [csky] Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Acked-by: Huacai Chen <chenhuacai@kernel.org> [LoongArch] Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Taichi Sugaya <sugaya.taichi@socionext.com> Cc: Neil Armstrong <narmstrong@baylibre.com> Cc: Qin Jian <qinjian@cqplus1.com> Cc: Guo Ren <guoren@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Zankel <chris@zankel.net> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | kbuild: remove head-y syntaxMasahiro Yamada2022-10-021-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Kbuild puts the objects listed in head-y at the head of vmlinux. Conventionally, we do this for head*.S, which contains the kernel entry point. A counter approach is to control the section order by the linker script. Actually, the code marked as __HEAD goes into the ".head.text" section, which is placed before the normal ".text" section. I do not know if both of them are needed. From the build system perspective, head-y is not mandatory. If you can achieve the proper code placement by the linker script only, it would be cleaner. I collected the current head-y objects into head-object-list.txt. It is a whitelist. My hope is it will be reduced in the long run. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
* | kbuild: use obj-y instead extra-y for objects placed at the headMasahiro Yamada2022-10-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The objects placed at the head of vmlinux need special treatments: - arch/$(SRCARCH)/Makefile adds them to head-y in order to place them before other archives in the linker command line. - arch/$(SRCARCH)/kernel/Makefile adds them to extra-y instead of obj-y to avoid them going into built-in.a. This commit gets rid of the latter. Create vmlinux.a to collect all the objects that are unconditionally linked to vmlinux. The objects listed in head-y are moved to the head of vmlinux.a by using 'ar m'. With this, arch/$(SRCARCH)/kernel/Makefile can consistently use obj-y for builtin objects. There is no *.o that is directly linked to vmlinux. Drop unneeded code in scripts/clang-tools/gen_compile_commands.py. $(AR) mPi needs 'T' to workaround the llvm-ar bug. The fix was suggested by Nathan Chancellor [1]. [1]: https://lore.kernel.org/llvm/YyjjT5gQ2hGMH0ni@dev-arch.thelio-3990X/ Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
* | nios2: move core-y in arch/nios2/Makefile to arch/nios2/KbuildMasahiro Yamada2022-09-292-4/+2
|/ | | | | | Use obj-y to clean up Makefile. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
* nios2: add force_successful_syscall_return()Al Viro2022-08-152-0/+8
| | | | | | | | | | | | | | | | If we use the ancient SysV syscall ABI, we'd better have tell the kernel how to claim that a negative return value is a success. Use ->orig_r2 for that - it's inaccessible via ptrace, so it's a fair game for changes and it's normally[*] non-negative on return from syscall. Set to -1; syscall is not going to be restart-worthy by definition, so we won't interfere with that use either. [*] the only exception is rt_sigreturn(), where we skip the entire messing with r1/r2 anyway. Fixes: 82ed08dd1b0e ("nios2: Exception handling") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
* nios2: restarts apply only to the first sigframe we build...Al Viro2022-08-151-0/+1
| | | | | | Fixes: b53e906d255d ("nios2: Signal handling support") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
* nios2: fix syscall restart checksAl Viro2022-08-151-1/+1
| | | | | | | | | | | | | sys_foo() returns -512 (aka -ERESTARTSYS) => do_signal() sees 512 in r2 and 1 in r1. sys_foo() returns 512 => do_signal() sees 512 in r2 and 0 in r1. The former is restart-worthy; the latter obviously isn't. Fixes: b53e906d255d ("nios2: Signal handling support") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
* nios2: traced syscall does need to check the syscall numberAl Viro2022-08-151-3/+8
| | | | | | | | | all checks done before letting the tracer modify the register state are worthless... Fixes: 82ed08dd1b0e ("nios2: Exception handling") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
* nios2: don't leave NULLs in sys_call_table[]Al Viro2022-08-152-1/+1
| | | | | | | | fill the gaps in there with sys_ni_syscall, as everyone does... Fixes: 82ed08dd1b0e ("nios2: Exception handling") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
* nios2: page fault et.al. are *not* restartable syscalls...Al Viro2022-08-152-4/+3
| | | | | | | | | make sure that ->orig_r2 is negative for everything except the syscalls. Fixes: 82ed08dd1b0e ("nios2: Exception handling") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
* nios2: drop definition of PGD_ORDERMike Rapoport2022-07-173-6/+3
| | | | | | | | | | | | | | | | | | | This is the order of the page table allocation, not the order of a PGD. Since its always hardwired to 0, simply drop it. Link: https://lkml.kernel.org/r/20220703141203.147893-9-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Dinh Nguyen <dinguyen@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Guo Ren <guoren@kernel.org> Cc: Helge Deller <deller@gmx.de> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Xuerui Wang <kernel@xen0n.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* nios2: drop definition of PTE_ORDERMike Rapoport2022-07-172-3/+2
| | | | | | | | | | | | | | | | | | | This is the order of the page table allocation, not the order of a PTE. Since its always hardwired to 0, simply drop it. Link: https://lkml.kernel.org/r/20220703141203.147893-8-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Dinh Nguyen <dinguyen@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Guo Ren <guoren@kernel.org> Cc: Helge Deller <deller@gmx.de> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Xuerui Wang <kernel@xen0n.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* mm/mmap: drop ARCH_HAS_VM_GET_PAGE_PROTAnshuman Khandual2022-07-171-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now all the platforms enable ARCH_HAS_GET_PAGE_PROT. They define and export own vm_get_page_prot() whether custom or standard DECLARE_VM_GET_PAGE_PROT. Hence there is no need for default generic fallback for vm_get_page_prot(). Just drop this fallback and also ARCH_HAS_GET_PAGE_PROT mechanism. Link: https://lkml.kernel.org/r/20220711070600.2378316-27-anshuman.khandual@arm.com Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Brian Cain <bcain@quicinc.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* nios2/mm: enable ARCH_HAS_VM_GET_PAGE_PROTAnshuman Khandual2022-07-173-16/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This enables ARCH_HAS_VM_GET_PAGE_PROT on the platform and exports standard vm_get_page_prot() implementation via DECLARE_VM_GET_PAGE_PROT, which looks up a private and static protection_map[] array. Subsequently all __SXXX and __PXXX macros can be dropped which are no longer needed. Link: https://lkml.kernel.org/r/20220711070600.2378316-16-anshuman.khandual@arm.com Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Acked-by: Dinh Nguyen <dinguyen@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Brian Cain <bcain@quicinc.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Christoph Hellwig <hch@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* mm: avoid unnecessary page fault retires on shared memory typesPeter Xu2022-06-161-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I observed that for each of the shared file-backed page faults, we're very likely to retry one more time for the 1st write fault upon no page. It's because we'll need to release the mmap lock for dirty rate limit purpose with balance_dirty_pages_ratelimited() (in fault_dirty_shared_page()). Then after that throttling we return VM_FAULT_RETRY. We did that probably because VM_FAULT_RETRY is the only way we can return to the fault handler at that time telling it we've released the mmap lock. However that's not ideal because it's very likely the fault does not need to be retried at all since the pgtable was well installed before the throttling, so the next continuous fault (including taking mmap read lock, walk the pgtable, etc.) could be in most cases unnecessary. It's not only slowing down page faults for shared file-backed, but also add more mmap lock contention which is in most cases not needed at all. To observe this, one could try to write to some shmem page and look at "pgfault" value in /proc/vmstat, then we should expect 2 counts for each shmem write simply because we retried, and vm event "pgfault" will capture that. To make it more efficient, add a new VM_FAULT_COMPLETED return code just to show that we've completed the whole fault and released the lock. It's also a hint that we should very possibly not need another fault immediately on this page because we've just completed it. This patch provides a ~12% perf boost on my aarch64 test VM with a simple program sequentially dirtying 400MB shmem file being mmap()ed and these are the time it needs: Before: 650.980 ms (+-1.94%) After: 569.396 ms (+-1.38%) I believe it could help more than that. We need some special care on GUP and the s390 pgfault handler (for gmap code before returning from pgfault), the rest changes in the page fault handlers should be relatively straightforward. Another thing to mention is that mm_account_fault() does take this new fault as a generic fault to be accounted, unlike VM_FAULT_RETRY. I explicitly didn't touch hmm_vma_fault() and break_ksm() because they do not handle VM_FAULT_RETRY even with existing code, so I'm literally keeping them as-is. Link: https://lkml.kernel.org/r/20220530183450.42886-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Vineet Gupta <vgupta@kernel.org> Acked-by: Guo Ren <guoren@kernel.org> Acked-by: Max Filippov <jcmvbkbc@gmail.com> Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Alistair Popple <apopple@nvidia.com> Reviewed-by: Ingo Molnar <mingo@kernel.org> Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> [arm part] Acked-by: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Stafford Horne <shorne@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Brian Cain <bcain@quicinc.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Richard Weinberger <richard@nod.at> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Janosch Frank <frankja@linux.ibm.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Will Deacon <will@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Michal Simek <monstr@monstr.eu> Cc: Matt Turner <mattst88@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: David Hildenbrand <david@redhat.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Chris Zankel <chris@zankel.net> Cc: Hugh Dickins <hughd@google.com> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Rich Felker <dalias@libc.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Helge Deller <deller@gmx.de> Cc: Yoshinori Sato <ysato@users.osdn.me> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* Merge tag 'kthread-cleanups-for-v5.19' of ↵Linus Torvalds2022-06-031-5/+7
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull kthread updates from Eric Biederman: "This updates init and user mode helper tasks to be ordinary user mode tasks. Commit 40966e316f86 ("kthread: Ensure struct kthread is present for all kthreads") caused init and the user mode helper threads that call kernel_execve to have struct kthread allocated for them. This struct kthread going away during execve in turned made a use after free of struct kthread possible. Here, commit 343f4c49f243 ("kthread: Don't allocate kthread_struct for init and umh") is enough to fix the use after free and is simple enough to be backportable. The rest of the changes pass struct kernel_clone_args to clean things up and cause the code to make sense. In making init and the user mode helpers tasks purely user mode tasks I ran into two complications. The function task_tick_numa was detecting tasks without an mm by testing for the presence of PF_KTHREAD. The initramfs code in populate_initrd_image was using flush_delayed_fput to ensuere the closing of all it's file descriptors was complete, and flush_delayed_fput does not work in a userspace thread. I have looked and looked and more complications and in my code review I have not found any, and neither has anyone else with the code sitting in linux-next" * tag 'kthread-cleanups-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: sched: Update task_tick_numa to ignore tasks without an mm fork: Stop allowing kthreads to call execve fork: Explicitly set PF_KTHREAD init: Deal with the init process being a user mode process fork: Generalize PF_IO_WORKER handling fork: Explicity test for idle tasks in copy_thread fork: Pass struct kernel_clone_args into copy_thread kthread: Don't allocate kthread_struct for init and umh
| * fork: Generalize PF_IO_WORKER handlingEric W. Biederman2022-05-071-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add fn and fn_arg members into struct kernel_clone_args and test for them in copy_thread (instead of testing for PF_KTHREAD | PF_IO_WORKER). This allows any task that wants to be a user space task that only runs in kernel mode to use this functionality. The code on x86 is an exception and still retains a PF_KTHREAD test because x86 unlikely everything else handles kthreads slightly differently than user space tasks that start with a function. The functions that created tasks that start with a function have been updated to set ".fn" and ".fn_arg" instead of ".stack" and ".stack_size". These functions are fork_idle(), create_io_thread(), kernel_thread(), and user_mode_thread(). Link: https://lkml.kernel.org/r/20220506141512.516114-4-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
| * fork: Pass struct kernel_clone_args into copy_threadEric W. Biederman2022-05-071-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With io_uring we have started supporting tasks that are for most purposes user space tasks that exclusively run code in kernel mode. The kernel task that exec's init and tasks that exec user mode helpers are also user mode tasks that just run kernel code until they call kernel execve. Pass kernel_clone_args into copy_thread so these oddball tasks can be supported more cleanly and easily. v2: Fix spelling of kenrel_clone_args on h8300 Link: https://lkml.kernel.org/r/20220506141512.516114-2-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* | Merge tag 'kbuild-v5.19' of ↵Linus Torvalds2022-05-262-24/+1
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Add HOSTPKG_CONFIG env variable to allow users to override pkg-config - Support W=e as a shorthand for KCFLAGS=-Werror - Fix CONFIG_IKHEADERS build to support toybox cpio - Add scripts/dummy-tools/pahole to ease distro packagers' life - Suppress false-positive warnings from checksyscalls.sh for W=2 build - Factor out the common code of arch/*/boot/install.sh into scripts/install.sh - Support 'kernel-install' tool in scripts/prune-kernel - Refactor module-versioning to link the symbol versions at the final link of vmlinux and modules - Remove CONFIG_MODULE_REL_CRCS because module-versioning now works in an arch-agnostic way - Refactor modpost, Makefiles * tag 'kbuild-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (56 commits) genksyms: adjust the output format to modpost kbuild: stop merging *.symversions kbuild: link symbol CRCs at final link, removing CONFIG_MODULE_REL_CRCS modpost: extract symbol versions from *.cmd files modpost: add sym_find_with_module() helper modpost: change the license of EXPORT_SYMBOL to bool type modpost: remove left-over cross_compile declaration kbuild: record symbol versions in *.cmd files kbuild: generate a list of objects in vmlinux modpost: move *.mod.c generation to write_mod_c_files() modpost: merge add_{intree_flag,retpoline,staging_flag} to add_header scripts/prune-kernel: Use kernel-install if available kbuild: factor out the common installation code into scripts/install.sh modpost: split new_symbol() to symbol allocation and hash table addition modpost: make sym_add_exported() always allocate a new symbol modpost: make multiple export error modpost: dump Module.symvers in the same order of modules.order modpost: traverse the namespace_list in order modpost: use doubly linked list for dump_lists modpost: traverse unresolved symbols in order ...
| * | kbuild: factor out the common installation code into scripts/install.shMasahiro Yamada2022-05-112-24/+1
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Many architectures have similar install.sh scripts. The first half is really generic; it verifies that the kernel image and System.map exist, then executes ~/bin/${INSTALLKERNEL} or /sbin/${INSTALLKERNEL} if available. The second half is kind of arch-specific; it copies the kernel image and System.map to the destination, but the code is slightly different. Factor out the generic part into scripts/install.sh. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nicolas Schier <n.schier@avm.de>
* / nios2: use fallback for random_get_entropy() instead of zeroJason A. Donenfeld2022-05-131-0/+3
|/ | | | | | | | | | | | | | | | In the event that random_get_entropy() can't access a cycle counter or similar, falling back to returning 0 is really not the best we can do. Instead, at least calling random_get_entropy_fallback() would be preferable, because that always needs to return _something_, even falling back to jiffies eventually. It's not as though random_get_entropy_fallback() is super high precision or guaranteed to be entropic, but basically anything that's not zero all the time is better than returning zero all the time. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arnd Bergmann <arnd@arndb.de> Acked-by: Dinh Nguyen <dinguyen@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* Merge tag 'ptrace-cleanups-for-v5.18' of ↵Linus Torvalds2022-03-282-5/+4
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull ptrace cleanups from Eric Biederman: "This set of changes removes tracehook.h, moves modification of all of the ptrace fields inside of siglock to remove races, adds a missing permission check to ptrace.c The removal of tracehook.h is quite significant as it has been a major source of confusion in recent years. Much of that confusion was around task_work and TIF_NOTIFY_SIGNAL (which I have now decoupled making the semantics clearer). For people who don't know tracehook.h is a vestiage of an attempt to implement uprobes like functionality that was never fully merged, and was later superseeded by uprobes when uprobes was merged. For many years now we have been removing what tracehook functionaly a little bit at a time. To the point where anything left in tracehook.h was some weird strange thing that was difficult to understand" * tag 'ptrace-cleanups-for-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: ptrace: Remove duplicated include in ptrace.c ptrace: Check PTRACE_O_SUSPEND_SECCOMP permission on PTRACE_SEIZE ptrace: Return the signal to continue with from ptrace_stop ptrace: Move setting/clearing ptrace_message into ptrace_stop tracehook: Remove tracehook.h resume_user_mode: Move to resume_user_mode.h resume_user_mode: Remove #ifdef TIF_NOTIFY_RESUME in set_notify_resume signal: Move set_notify_signal and clear_notify_signal into sched/signal.h task_work: Decouple TIF_NOTIFY_SIGNAL and task_work task_work: Call tracehook_notify_signal from get_signal on all architectures task_work: Introduce task_work_pending task_work: Remove unnecessary include from posix_timers.h ptrace: Remove tracehook_signal_handler ptrace: Remove arch_syscall_{enter,exit}_tracehook ptrace: Create ptrace_report_syscall_{entry,exit} in ptrace.h ptrace/arm: Rename tracehook_report_syscall report_syscall ptrace: Move ptrace_report_syscall into ptrace.h
| * resume_user_mode: Move to resume_user_mode.hEric W. Biederman2022-03-101-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move set_notify_resume and tracehook_notify_resume into resume_user_mode.h. While doing that rename tracehook_notify_resume to resume_user_mode_work. Update all of the places that included tracehook.h for these functions to include resume_user_mode.h instead. Update all of the callers of tracehook_notify_resume to call resume_user_mode_work. Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20220309162454.123006-12-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
| * ptrace: Create ptrace_report_syscall_{entry,exit} in ptrace.hEric W. Biederman2022-03-101-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | Rename tracehook_report_syscall_{entry,exit} to ptrace_report_syscall_{entry,exit} and place them in ptrace.h There is no longer any generic tracehook infractructure so make these ptrace specific functions ptrace specific. Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20220309162454.123006-3-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* | Merge tag 'asm-generic-5.18' of ↵Linus Torvalds2022-03-234-74/+61
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic updates from Arnd Bergmann: "There are three sets of updates for 5.18 in the asm-generic tree: - The set_fs()/get_fs() infrastructure gets removed for good. This was already gone from all major architectures, but now we can finally remove it everywhere, which loses some particularly tricky and error-prone code. There is a small merge conflict against a parisc cleanup, the solution is to use their new version. - The nds32 architecture ends its tenure in the Linux kernel. The hardware is still used and the code is in reasonable shape, but the mainline port is not actively maintained any more, as all remaining users are thought to run vendor kernels that would never be updated to a future release. - A series from Masahiro Yamada cleans up some of the uapi header files to pass the compile-time checks" * tag 'asm-generic-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (27 commits) nds32: Remove the architecture uaccess: remove CONFIG_SET_FS ia64: remove CONFIG_SET_FS support sh: remove CONFIG_SET_FS support sparc64: remove CONFIG_SET_FS support lib/test_lockup: fix kernel pointer check for separate address spaces uaccess: generalize access_ok() uaccess: fix type mismatch warnings from access_ok() arm64: simplify access_ok() m68k: fix access_ok for coldfire MIPS: use simpler access_ok() MIPS: Handle address errors for accesses above CPU max virtual user address uaccess: add generic __{get,put}_kernel_nofault nios2: drop access_ok() check from __put_user() x86: use more conventional access_ok() definition x86: remove __range_not_ok() sparc64: add __{get,put}_kernel_nofault() nds32: fix access_ok() checks in get/put_user uaccess: fix nios2 and microblaze get_user_8() sparc64: fix building assembly files ...