summaryrefslogtreecommitdiffstats
path: root/arch/mips
Commit message (Collapse)AuthorAgeFilesLines
...
| | * | | | | | MIPS: use simpler access_ok()Arnd Bergmann2022-02-251-18/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before unifying the mips version of __access_ok() with the generic code, this converts it to the same algorithm. This is a change in behavior on mips64, as now address in the user segment, the lower 2^62 bytes, is taken to be valid, relying on a page fault for addresses that are within that segment but not valid on that CPU. The new version should be the most effecient way to do this, but it gets rid of the special handling for size=0 that most other architectures ignore as well. Signed-off-by: Arnd Bergmann <arnd@arndb.de>
| | * | | | | | MIPS: Handle address errors for accesses above CPU max virtual user addressThomas Bogendoerfer2022-02-251-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Address errors have always been treated as unaliged accesses and handled as such. But address errors are also issued for illegal accesses like user to kernel space or accesses outside of implemented spaces. This change implements Linux exception handling for accesses to the illegal space above the CPU implemented maximum virtual user address and the MIPS 64bit architecture maximum. With this we can now use a fixed value for the maximum task size on every MIPS CPU and get a more optimized access_ok(). Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
| | * | | | | | uaccess: add generic __{get,put}_kernel_nofaultArnd Bergmann2022-02-251-2/+0
| | | |_|_|_|/ | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Nine architectures are still missing __{get,put}_kernel_nofault: alpha, ia64, microblaze, nds32, nios2, openrisc, sh, sparc32, xtensa. Add a generic version that lets everything use the normal copy_{from,to}_kernel_nofault() code based on these, removing the last use of get_fs()/set_fs() from architecture-independent code. Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
| * | | | | | shmbuf.h: add asm/shmbuf.h to UAPI compile-test coverageMasahiro Yamada2022-02-171-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | asm/shmbuf.h is currently excluded from the UAPI compile-test because of the errors like follows: HDRTEST usr/include/asm/shmbuf.h In file included from ./usr/include/asm/shmbuf.h:6, from <command-line>: ./usr/include/asm-generic/shmbuf.h:26:33: error: field ‘shm_perm’ has incomplete type 26 | struct ipc64_perm shm_perm; /* operation perms */ | ^~~~~~~~ ./usr/include/asm-generic/shmbuf.h:27:9: error: unknown type name ‘size_t’ 27 | size_t shm_segsz; /* size of segment (bytes) */ | ^~~~~~ ./usr/include/asm-generic/shmbuf.h:40:9: error: unknown type name ‘__kernel_pid_t’ 40 | __kernel_pid_t shm_cpid; /* pid of creator */ | ^~~~~~~~~~~~~~ ./usr/include/asm-generic/shmbuf.h:41:9: error: unknown type name ‘__kernel_pid_t’ 41 | __kernel_pid_t shm_lpid; /* pid of last operator */ | ^~~~~~~~~~~~~~ The errors can be fixed by replacing size_t with __kernel_size_t and by including proper headers. Then, remove the no-header-test entry from user/include/Makefile. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
| * | | | | | signal.h: add linux/signal.h and asm/signal.h to UAPI compile-test coverageMasahiro Yamada2022-02-171-1/+1
| |/ / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | linux/signal.h and asm/signal.h are currently excluded from the UAPI compile-test because of the errors like follows: HDRTEST usr/include/asm/signal.h In file included from <command-line>: ./usr/include/asm/signal.h:103:9: error: unknown type name ‘size_t’ 103 | size_t ss_size; | ^~~~~~ The errors can be fixed by replacing size_t with __kernel_size_t. Then, remove the no-header-test entries from user/include/Makefile. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
* | | | | | Merge tag 'folio-5.18c' of git://git.infradead.org/users/willy/pagecacheLinus Torvalds2022-03-221-5/+5
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull folio updates from Matthew Wilcox: - Rewrite how munlock works to massively reduce the contention on i_mmap_rwsem (Hugh Dickins): https://lore.kernel.org/linux-mm/8e4356d-9622-a7f0-b2c-f116b5f2efea@google.com/ - Sort out the page refcount mess for ZONE_DEVICE pages (Christoph Hellwig): https://lore.kernel.org/linux-mm/20220210072828.2930359-1-hch@lst.de/ - Convert GUP to use folios and make pincount available for order-1 pages. (Matthew Wilcox) - Convert a few more truncation functions to use folios (Matthew Wilcox) - Convert page_vma_mapped_walk to use PFNs instead of pages (Matthew Wilcox) - Convert rmap_walk to use folios (Matthew Wilcox) - Convert most of shrink_page_list() to use a folio (Matthew Wilcox) - Add support for creating large folios in readahead (Matthew Wilcox) * tag 'folio-5.18c' of git://git.infradead.org/users/willy/pagecache: (114 commits) mm/damon: minor cleanup for damon_pa_young selftests/vm/transhuge-stress: Support file-backed PMD folios mm/filemap: Support VM_HUGEPAGE for file mappings mm/readahead: Switch to page_cache_ra_order mm/readahead: Align file mappings for non-DAX mm/readahead: Add large folio readahead mm: Support arbitrary THP sizes mm: Make large folios depend on THP mm: Fix READ_ONLY_THP warning mm/filemap: Allow large folios to be added to the page cache mm: Turn can_split_huge_page() into can_split_folio() mm/vmscan: Convert pageout() to take a folio mm/vmscan: Turn page_check_references() into folio_check_references() mm/vmscan: Account large folios correctly mm/vmscan: Optimise shrink_page_list for non-PMD-sized folios mm/vmscan: Free non-shmem folios without splitting them mm/rmap: Constify the rmap_walk_control argument mm/rmap: Convert rmap_walk() to take a folio mm: Turn page_anon_vma() into folio_anon_vma() mm/rmap: Turn page_lock_anon_vma_read() into folio_lock_anon_vma_read() ...
| * | | | | | mips: Make pmd_pfn() available in all configurationsMatthew Wilcox (Oracle)2022-03-211-5/+5
| | |_|_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Whether or not the platform supports PMD sized pages, we need to provide pmd_pfn() for an upcoming patch. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
* | | | | | Merge branch 'akpm' (patches from Andrew)Linus Torvalds2022-03-221-5/+0
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge updates from Andrew Morton: - A few misc subsystems: kthread, scripts, ntfs, ocfs2, block, and vfs - Most the MM patches which precede the patches in Willy's tree: kasan, pagecache, gup, swap, shmem, memcg, selftests, pagemap, mremap, sparsemem, vmalloc, pagealloc, memory-failure, mlock, hugetlb, userfaultfd, vmscan, compaction, mempolicy, oom-kill, migration, thp, cma, autonuma, psi, ksm, page-poison, madvise, memory-hotplug, rmap, zswap, uaccess, ioremap, highmem, cleanups, kfence, hmm, and damon. * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (227 commits) mm/damon/sysfs: remove repeat container_of() in damon_sysfs_kdamond_release() Docs/ABI/testing: add DAMON sysfs interface ABI document Docs/admin-guide/mm/damon/usage: document DAMON sysfs interface selftests/damon: add a test for DAMON sysfs interface mm/damon/sysfs: support DAMOS stats mm/damon/sysfs: support DAMOS watermarks mm/damon/sysfs: support schemes prioritization mm/damon/sysfs: support DAMOS quotas mm/damon/sysfs: support DAMON-based Operation Schemes mm/damon/sysfs: support the physical address space monitoring mm/damon/sysfs: link DAMON for virtual address spaces monitoring mm/damon: implement a minimal stub for sysfs-based DAMON interface mm/damon/core: add number of each enum type values mm/damon/core: allow non-exclusive DAMON start/stop Docs/damon: update outdated term 'regions update interval' Docs/vm/damon/design: update DAMON-Idle Page Tracking interference handling Docs/vm/damon: call low level monitoring primitives the operations mm/damon: remove unnecessary CONFIG_DAMON option mm/damon/paddr,vaddr: remove damon_{p,v}a_{target_valid,set_operations}() mm/damon/dbgfs-test: fix is_target_id() change ...
| * | | | | | drivers/base/node: consolidate node device subsystem initialization in ↵David Hildenbrand2022-03-221-5/+0
| | |_|_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | node_dev_init() ... and call node_dev_init() after memory_dev_init() from driver_init(), so before any of the existing arch/subsys calls. All online nodes should be known at that point: early during boot, arch code determines node and zone ranges and sets the relevant nodes online; usually this happens in setup_arch(). This is in line with memory_dev_init(), which initializes the memory device subsystem and creates all memory block devices. Similar to memory_dev_init(), panic() if anything goes wrong, we don't want to continue with such basic initialization errors. The important part is that node_dev_init() gets called after memory_dev_init() and after cpu_dev_init(), but before any of the relevant archs call register_cpu() to register the new cpu device under the node device. The latter should be the case for the current users of topology_init(). Link: https://lkml.kernel.org/r/20220203105212.30385-1-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Tested-by: Anatoly Pugachev <matorola@gmail.com> (sparc64) Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Mike Rapoport <rppt@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | Merge tag 'nfsd-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linuxLinus Torvalds2022-03-2215-15/+0
|\ \ \ \ \ \ | |/ / / / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull nfsd updates from Chuck Lever: "New features: - NFSv3 support in NFSD is now always built - Added NFSD support for the NFSv4 birth-time file attribute - Added support for storing and displaying sockaddrs in trace points - NFSD now recognizes RPC_AUTH_TLS probes Performance improvements: - Optimized the svc transport enqueuing mechanism - Added micro-optimizations for the duplicate reply cache Notable bug fixes: - Allocation of the NFSD file cache hash table is more reliable" * tag 'nfsd-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (30 commits) nfsd: fix using the correct variable for sizeof() nfsd: use correct format characters NFSD: prevent integer overflow on 32 bit systems NFSD: prevent underflow in nfssvc_decode_writeargs() fs/lock: documentation cleanup. Replace inode->i_lock with flc_lock. NFSD: Fix nfsd_breaker_owns_lease() return values NFSD: Clean up _lm_ operation names arch: Remove references to CONFIG_NFSD_V3 in the default configs NFSD: Remove CONFIG_NFSD_V3 nfsd: more robust allocation failure handling in nfsd_file_cache_init SUNRPC: Teach server to recognize RPC_AUTH_TLS NFSD: Move svc_serv_ops::svo_function into struct svc_serv NFSD: Remove svc_serv_ops::svo_module SUNRPC: Remove svc_shutdown_net() SUNRPC: Rename svc_close_xprt() SUNRPC: Rename svc_create_xprt() SUNRPC: Remove svo_shutdown method SUNRPC: Merge svc_do_enqueue_xprt() into svc_enqueue_xprt() SUNRPC: Remove the .svo_enqueue_xprt method SUNRPC: Record endpoint information in trace log ...
| * | | | | arch: Remove references to CONFIG_NFSD_V3 in the default configsChuck Lever2022-03-1115-15/+0
| |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CONFIG_NFSD_V3 has been removed. NFSD support for NFSv3 can no longer be disabled. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* | | | | MIPS: ralink: mt7621: use bitwise NOT instead of logicalIlya Lipnitskiy2022-03-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It was the intention to reverse the bits, not make them all zero by using logical NOT operator. Fixes: cc19db8b312a ("MIPS: ralink: mt7621: do memory detection on KSEG1") Suggested-by: Chuanhong Guo <gch981213@gmail.com> Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com> Reviewed-by: Sergio Paracuellos <sergio.paracuellos@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
* | | | | mips: setup: fix setnocoherentio() boolean settingRandy Dunlap2022-02-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Correct a typo/pasto: setnocoherentio() should set dma_default_coherent to false, not true. Fixes: 14ac09a65e19 ("MIPS: refactor the runtime coherent vs noncoherent DMA indicators") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: linux-mips@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
* | | | | MIPS: smp: fill in sibling and core maps earlierAlexander Lobakin2022-02-161-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After enabling CONFIG_SCHED_CORE (landed during 5.14 cycle), 2-core 2-thread-per-core interAptiv (CPS-driven) started emitting the following: [ 0.025698] CPU1 revision is: 0001a120 (MIPS interAptiv (multi)) [ 0.048183] ------------[ cut here ]------------ [ 0.048187] WARNING: CPU: 1 PID: 0 at kernel/sched/core.c:6025 sched_core_cpu_starting+0x198/0x240 [ 0.048220] Modules linked in: [ 0.048233] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.17.0-rc3+ #35 b7b319f24073fd9a3c2aa7ad15fb7993eec0b26f [ 0.048247] Stack : 817f0000 00000004 327804c8 810eb050 00000000 00000004 00000000 c314fdd1 [ 0.048278] 830cbd64 819c0000 81800000 817f0000 83070bf4 00000001 830cbd08 00000000 [ 0.048307] 00000000 00000000 815fcbc4 00000000 00000000 00000000 00000000 00000000 [ 0.048334] 00000000 00000000 00000000 00000000 817f0000 00000000 00000000 817f6f34 [ 0.048361] 817f0000 818a3c00 817f0000 00000004 00000000 00000000 4dc33260 0018c933 [ 0.048389] ... [ 0.048396] Call Trace: [ 0.048399] [<8105a7bc>] show_stack+0x3c/0x140 [ 0.048424] [<8131c2a0>] dump_stack_lvl+0x60/0x80 [ 0.048440] [<8108b5c0>] __warn+0xc0/0xf4 [ 0.048454] [<8108b658>] warn_slowpath_fmt+0x64/0x10c [ 0.048467] [<810bd418>] sched_core_cpu_starting+0x198/0x240 [ 0.048483] [<810c6514>] sched_cpu_starting+0x14/0x80 [ 0.048497] [<8108c0f8>] cpuhp_invoke_callback_range+0x78/0x140 [ 0.048510] [<8108d914>] notify_cpu_starting+0x94/0x140 [ 0.048523] [<8106593c>] start_secondary+0xbc/0x280 [ 0.048539] [ 0.048543] ---[ end trace 0000000000000000 ]--- [ 0.048636] Synchronize counters for CPU 1: done. ...for each but CPU 0/boot. Basic debug printks right before the mentioned line say: [ 0.048170] CPU: 1, smt_mask: So smt_mask, which is sibling mask obviously, is empty when entering the function. This is critical, as sched_core_cpu_starting() calculates core-scheduling parameters only once per CPU start, and it's crucial to have all the parameters filled in at that moment (at least it uses cpu_smt_mask() which in fact is `&cpu_sibling_map[cpu]` on MIPS). A bit of debugging led me to that set_cpu_sibling_map() performing the actual map calculation, was being invocated after notify_cpu_start(), and exactly the latter function starts CPU HP callback round (sched_core_cpu_starting() is basically a CPU HP callback). While the flow is same on ARM64 (maps after the notifier, although before calling set_cpu_online()), x86 started calculating sibling maps earlier than starting the CPU HP callbacks in Linux 4.14 (see [0] for the reference). Neither me nor my brief tests couldn't find any potential caveats in calculating the maps right after performing delay calibration, but the WARN splat is now gone. The very same debug prints now yield exactly what I expected from them: [ 0.048433] CPU: 1, smt_mask: 0-1 [0] https://git.kernel.org/pub/scm/linux/kernel/git/mips/linux.git/commit/?id=76ce7cfe35ef Signed-off-by: Alexander Lobakin <alobakin@pm.me> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
* | | | | MIPS: ralink: mt7621: do memory detection on KSEG1Chuanhong Guo2022-02-161-13/+23
|/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It's reported that current memory detection code occasionally detects larger memory under some bootloaders. Current memory detection code tests whether address space wraps around on KSEG0, which is unreliable because it's cached. Rewrite memory size detection to perform the same test on KSEG1 instead. While at it, this patch also does the following two things: 1. use a fixed pattern instead of a random function pointer as the magic value. 2. add an additional memory write and a second comparison as part of the test to prevent possible smaller memory detection result due to leftover values in memory. Fixes: 139c949f7f0a MIPS: ("ralink: mt7621: add memory detection support") Reported-by: Rui Salvaterra <rsalvaterra@gmail.com> Signed-off-by: Chuanhong Guo <gch981213@gmail.com> Tested-by: Sergio Paracuellos <sergio.paracuellos@gmail.com> Tested-by: Rui Salvaterra <rsalvaterra@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
* / / / MIPS: DTS: CI20: fix how ddc power is enabledH. Nikolaus Schaller2022-02-091-13/+2
|/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Originally we proposed a new hdmi-5v-supply regulator reference for CI20 device tree but that was superseded by a better idea to use the already defined "ddc-en-gpios" property of the "hdmi-connector". Since "MIPS: DTS: CI20: Add DT nodes for HDMI setup" has already been applied to v5.17-rc1, we add this on top. Fixes: ae1b8d2c2de9 ("MIPS: DTS: CI20: Add DT nodes for HDMI setup") Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com> Reviewed-by: Paul Cercueil <paul@crapouillou.net> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
* | | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2022-02-051-4/+46
|\ \ \ | |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull kvm fixes from Paolo Bonzini: "ARM: - A couple of fixes when handling an exception while a SError has been delivered - Workaround for Cortex-A510's single-step erratum RISC-V: - Make CY, TM, and IR counters accessible in VU mode - Fix SBI implementation version x86: - Report deprecation of x87 features in supported CPUID - Preparation for fixing an interrupt delivery race on AMD hardware - Sparse fix All except POWER and s390: - Rework guest entry code to correctly mark noinstr areas and fix vtime' accounting (for x86, this was already mostly correct but not entirely; for ARM, MIPS and RISC-V it wasn't)" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: Use ERR_PTR_USR() to return -EFAULT as a __user pointer KVM: x86: Report deprecated x87 features in supported CPUID KVM: arm64: Workaround Cortex-A510's single-step and PAC trap errata KVM: arm64: Stop handle_exit() from handling HVC twice when an SError occurs KVM: arm64: Avoid consuming a stale esr value when SError occur RISC-V: KVM: Fix SBI implementation version RISC-V: KVM: make CY, TM, and IR counters accessible in VU mode kvm/riscv: rework guest entry logic kvm/arm64: rework guest entry logic kvm/x86: rework guest entry logic kvm/mips: rework guest entry logic kvm: add guest_state_{enter,exit}_irqoff() KVM: x86: Move delivery of non-APICv interrupt into vendor code kvm: Move KVM_GET_XSAVE2 IOCTL definition at the end of kvm.h
| * | Merge tag 'kvmarm-fixes-5.17-2' of ↵Paolo Bonzini2022-02-05125-4074/+721
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 5.17, take #2 - A couple of fixes when handling an exception while a SError has been delivered - Workaround for Cortex-A510's single-step[ erratum
| * | | kvm/mips: rework guest entry logicMark Rutland2022-02-011-4/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In kvm_arch_vcpu_ioctl_run() we use guest_enter_irqoff() and guest_exit_irqoff() directly, with interrupts masked between these. As we don't handle any timer ticks during this window, we will not account time spent within the guest as guest time, which is unfortunate. Additionally, we do not inform lockdep or tracing that interrupts will be enabled during guest execution, which caan lead to misleading traces and warnings that interrupts have been enabled for overly-long periods. This patch fixes these issues by using the new timing and context entry/exit helpers to ensure that interrupts are handled during guest vtime but with RCU watching, with a sequence: guest_timing_enter_irqoff(); guest_state_enter_irqoff(); < run the vcpu > guest_state_exit_irqoff(); < take any pending IRQs > guest_timing_exit_irqoff(); In addition, as guest exits during the "run the vcpu" step are handled by kvm_mips_handle_exit(), a wrapper function is added which ensures that such exists are handled with a sequence: guest_state_exit_irqoff(); < handle the exit > guest_state_enter_irqoff(); This means that exits which stop the vCPU running will have a redundant guest_state_enter_irqoff() .. guest_state_exit_irqoff() sequence, which can be addressed with future rework. Since instrumentation may make use of RCU, we must also ensure that no instrumented code is run during the EQS. I've split out the critical section into a new kvm_mips_enter_exit_vcpu() helper which is marked noinstr. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com> Cc: Frederic Weisbecker <frederic@kernel.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Message-Id: <20220201132926.3301912-6-mark.rutland@arm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | Merge branch 'kvm-pi-raw-spinlock' into HEADPaolo Bonzini2022-01-194-5/+5
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | Bring in fix for VT-d posted interrupts before further changing the code in 5.17. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | \ \ \ Merge tag 'mips-fixes-5.17_2' of ↵Linus Torvalds2022-02-032-4/+10
|\ \ \ \ \ | |_|_|/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS fixes from Thomas Bogendoerfer: - fix missed change for PTR->PTR_WD conversion - kernel-doc fixes * tag 'mips-fixes-5.17_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: KVM: fix vz.c kernel-doc notation MIPS: octeon: Fix missed PTR->PTR_WD conversion
| * | | | MIPS: KVM: fix vz.c kernel-doc notationRandy Dunlap2022-02-011-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix all kernel-doc warnings in mips/kvm/vz.c as reported by the kernel test robot: arch/mips/kvm/vz.c:471: warning: Function parameter or member 'out_compare' not described in '_kvm_vz_save_htimer' arch/mips/kvm/vz.c:471: warning: Function parameter or member 'out_cause' not described in '_kvm_vz_save_htimer' arch/mips/kvm/vz.c:471: warning: Excess function parameter 'compare' description in '_kvm_vz_save_htimer' arch/mips/kvm/vz.c:471: warning: Excess function parameter 'cause' description in '_kvm_vz_save_htimer' arch/mips/kvm/vz.c:1551: warning: No description found for return value of 'kvm_trap_vz_handle_cop_unusable' arch/mips/kvm/vz.c:1552: warning: expecting prototype for kvm_trap_vz_handle_cop_unusuable(). Prototype was for kvm_trap_vz_handle_cop_unusable() instead arch/mips/kvm/vz.c:1597: warning: No description found for return value of 'kvm_trap_vz_handle_msa_disabled' Fixes: c992a4f6a9b0 ("KVM: MIPS: Implement VZ support") Fixes: f4474d50c7d4 ("KVM: MIPS/VZ: Support hardware guest timer") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: linux-mips@vger.kernel.org Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com> Cc: James Hogan <jhogan@kernel.org> Cc: kvm@vger.kernel.org Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
| * | | | MIPS: octeon: Fix missed PTR->PTR_WD conversionThomas Bogendoerfer2022-02-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Fixes: fa62f39dc7e2 ("MIPS: Fix build error due to PTR used in more places") Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
* | | | | Merge tag 'pci-v5.17-fixes-2' of ↵Linus Torvalds2022-01-291-5/+4
|\ \ \ \ \ | |/ / / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull pci fixes from Bjorn Helgaas: - Fix compilation warnings in new mt7621 driver (Sergio Paracuellos) - Restore the sysfs "rom" file for VGA shadow ROMs, which was broken when converting "rom" to be a static attribute (Bjorn Helgaas) * tag 'pci-v5.17-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI/sysfs: Find shadow ROM before static attribute initialization PCI: mt7621: Remove unused function pcie_rmw() PCI: mt7621: Drop of_match_ptr() to avoid unused variable
| * | | | PCI/sysfs: Find shadow ROM before static attribute initializationBjorn Helgaas2022-01-261-5/+4
| | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ville reported that the sysfs "rom" file for VGA devices disappeared after 527139d738d7 ("PCI/sysfs: Convert "rom" to static attribute"). Prior to 527139d738d7, FINAL fixups, including pci_fixup_video() where we find shadow ROMs, were run before pci_create_sysfs_dev_files() created the sysfs "rom" file. After 527139d738d7, "rom" is a static attribute and is created before FINAL fixups are run, so we didn't create "rom" files for shadow ROMs: acpi_pci_root_add ... pci_scan_single_device pci_device_add pci_fixup_video # <-- new HEADER fixup device_add ... if (grp->is_visible()) pci_dev_rom_attr_is_visible # after 527139d738d7 pci_bus_add_devices pci_bus_add_device pci_fixup_device(pci_fixup_final) pci_fixup_video # <-- previous FINAL fixup pci_create_sysfs_dev_files if (pci_resource_len(pdev, PCI_ROM_RESOURCE)) sysfs_create_bin_file("rom") # before 527139d738d7 Change pci_fixup_video() to be a HEADER fixup so it runs before sysfs static attributes are initialized. Rename the Loongson pci_fixup_radeon() to pci_fixup_video() and make its dmesg logging identical to the others since it is doing the same job. Link: https://lore.kernel.org/r/YbxqIyrkv3GhZVxx@intel.com Fixes: 527139d738d7 ("PCI/sysfs: Convert "rom" to static attribute") Link: https://lore.kernel.org/r/20220126154001.16895-1-helgaas@kernel.org Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v5.13+ Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Jiaxun Yang <jiaxun.yang@flygoat.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Krzysztof Wilczyński <kw@linux.com>
* / | | MIPS: Fix build error due to PTR used in more placesThomas Bogendoerfer2022-01-2718-185/+185
|/ / / | | | | | | | | | | | | | | | Use PTR_WD instead of PTR to avoid clashes with other parts. Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
* | | Merge tag 'bitmap-5.17-rc1' of git://github.com/norov/linuxLinus Torvalds2022-01-232-2/+0
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull bitmap updates from Yury Norov: - introduce for_each_set_bitrange() - use find_first_*_bit() instead of find_next_*_bit() where possible - unify for_each_bit() macros * tag 'bitmap-5.17-rc1' of git://github.com/norov/linux: vsprintf: rework bitmap_list_string lib: bitmap: add performance test for bitmap_print_to_pagebuf bitmap: unify find_bit operations mm/percpu: micro-optimize pcpu_is_populated() Replace for_each_*_bit_from() with for_each_*_bit() where appropriate find: micro-optimize for_each_{set,clear}_bit() include/linux: move for_each_bit() macros from bitops.h to find.h cpumask: replace cpumask_next_* with cpumask_first_* where appropriate tools: sync tools/bitmap with mother linux all: replace find_next{,_zero}_bit with find_first{,_zero}_bit where appropriate cpumask: use find_first_and_bit() lib: add find_first_and_bit() arch: remove GENERIC_FIND_FIRST_BIT entirely include: move find.h from asm_generic to linux bitops: move find_bit_*_le functions from le.h to find.h bitops: protect find_first_{,zero}_bit properly
| * | | arch: remove GENERIC_FIND_FIRST_BIT entirelyYury Norov2022-01-151-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In 5.12 cycle we enabled GENERIC_FIND_FIRST_BIT config option for ARM64 and MIPS. It increased performance and shrunk .text size; and so far I didn't receive any negative feedback on the change. https://lore.kernel.org/linux-arch/20210225135700.1381396-1-yury.norov@gmail.com/ Now I think it's a good time to switch all architectures to use find_{first,last}_bit() unconditionally, and so remove corresponding config option. The patch does't introduce functioal changes for arc, arm, arm64, mips, m68k, s390 and x86, for other architectures I expect improvement both in performance and .text size. Signed-off-by: Yury Norov <yury.norov@gmail.com> Tested-by: Alexander Lobakin <alobakin@pm.me> (mips) Reviewed-by: Alexander Lobakin <alobakin@pm.me> (mips) Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Will Deacon <will@kernel.org> Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
| * | | include: move find.h from asm_generic to linuxYury Norov2022-01-151-1/+0
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | find_bit API and bitmap API are closely related, but inclusion paths are different - include/asm-generic and include/linux, correspondingly. In the past it made a lot of troubles due to circular dependencies and/or undefined symbols. Fix this by moving find.h under include/linux. Signed-off-by: Yury Norov <yury.norov@gmail.com> Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
* | | Merge branch 'akpm' (patches from Andrew)Linus Torvalds2022-01-202-19/+5
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge more updates from Andrew Morton: "55 patches. Subsystems affected by this patch series: percpu, procfs, sysctl, misc, core-kernel, get_maintainer, lib, checkpatch, binfmt, nilfs2, hfs, fat, adfs, panic, delayacct, kconfig, kcov, and ubsan" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (55 commits) lib: remove redundant assignment to variable ret ubsan: remove CONFIG_UBSAN_OBJECT_SIZE kcov: fix generic Kconfig dependencies if ARCH_WANTS_NO_INSTR lib/Kconfig.debug: make TEST_KMOD depend on PAGE_SIZE_LESS_THAN_256KB btrfs: use generic Kconfig option for 256kB page size limit arch/Kconfig: split PAGE_SIZE_LESS_THAN_256KB from PAGE_SIZE_LESS_THAN_64KB configs: introduce debug.config for CI-like setup delayacct: track delays from memory compact Documentation/accounting/delay-accounting.rst: add thrashing page cache and direct compact delayacct: cleanup flags in struct task_delay_info and functions use it delayacct: fix incomplete disable operation when switch enable to disable delayacct: support swapin delay accounting for swapping without blkio panic: remove oops_id panic: use error_report_end tracepoint on warnings fs/adfs: remove unneeded variable make code cleaner FAT: use io_schedule_timeout() instead of congestion_wait() hfsplus: use struct_group_attr() for memcpy() region nilfs2: remove redundant pointer sbufs fs/binfmt_elf: use PT_LOAD p_align values for static PIE const_structs.checkpatch: add frequently used ops structs ...
| * | | mm: percpu: add generic pcpu_fc_alloc/free funcitonKefeng Wang2022-01-201-15/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the previous patch, we could add a generic pcpu first chunk allocate and free function to cleanup the duplicated definations on each architecture. Link: https://lkml.kernel.org/r/20211216112359.103822-4-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Dennis Zhou <dennis@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: Christoph Lameter <cl@linux.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | mm: percpu: add pcpu_fc_cpu_to_node_fn_t typedefKefeng Wang2022-01-201-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add pcpu_fc_cpu_to_node_fn_t and pass it into pcpu_fc_alloc_fn_t, pcpu first chunk allocation will call it to alloc memblock on the corresponding node by it, this is prepare for the next patch. Link: https://lkml.kernel.org/r/20211216112359.103822-3-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Dennis Zhou <dennis@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: Christoph Lameter <cl@linux.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | mm: percpu: generalize percpu related configKefeng Wang2022-01-201-8/+2
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Patch series "mm: percpu: Cleanup percpu first chunk function". When supporting page mapping percpu first chunk allocator on arm64, we found there are lots of duplicated codes in percpu embed/page first chunk allocator. This patchset is aimed to cleanup them and should no function change. The currently supported status about 'embed' and 'page' in Archs shows below, embed: NEED_PER_CPU_PAGE_FIRST_CHUNK page: NEED_PER_CPU_EMBED_FIRST_CHUNK embed page ------------------------ arm64 Y Y mips Y N powerpc Y Y riscv Y N sparc Y Y x86 Y Y ------------------------ There are two interfaces about percpu first chunk allocator, extern int __init pcpu_embed_first_chunk(size_t reserved_size, size_t dyn_size, size_t atom_size, pcpu_fc_cpu_distance_fn_t cpu_distance_fn, - pcpu_fc_alloc_fn_t alloc_fn, - pcpu_fc_free_fn_t free_fn); + pcpu_fc_cpu_to_node_fn_t cpu_to_nd_fn); extern int __init pcpu_page_first_chunk(size_t reserved_size, - pcpu_fc_alloc_fn_t alloc_fn, - pcpu_fc_free_fn_t free_fn, - pcpu_fc_populate_pte_fn_t populate_pte_fn); + pcpu_fc_cpu_to_node_fn_t cpu_to_nd_fn); The pcpu_fc_alloc_fn_t/pcpu_fc_free_fn_t is killed, we provide generic pcpu_fc_alloc() and pcpu_fc_free() function, which are called in the pcpu_embed/page_first_chunk(). 1) For pcpu_embed_first_chunk(), pcpu_fc_cpu_to_node_fn_t is needed to be provided when archs supported NUMA. 2) For pcpu_page_first_chunk(), the pcpu_fc_populate_pte_fn_t is killed too, a generic pcpu_populate_pte() which marked '__weak' is provided, if you need a different function to populate pte on the arch(like x86), please provide its own implementation. [1] https://github.com/kevin78/linux.git percpu-cleanup This patch (of 4): The HAVE_SETUP_PER_CPU_AREA/NEED_PER_CPU_EMBED_FIRST_CHUNK/ NEED_PER_CPU_PAGE_FIRST_CHUNK/USE_PERCPU_NUMA_NODE_ID configs, which have duplicate definitions on platforms that subscribe it. Move them into mm, drop these redundant definitions and instead just select it on applicable platforms. Link: https://lkml.kernel.org/r/20211216112359.103822-1-wangkefeng.wang@huawei.com Link: https://lkml.kernel.org/r/20211216112359.103822-2-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Cc: Will Deacon <will@kernel.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Christoph Lameter <cl@linux.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge tag 'kbuild-v5.17' of ↵Linus Torvalds2022-01-191-6/+6
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Add new kconfig target 'make mod2noconfig', which will be useful to speed up the build and test iteration. - Raise the minimum supported version of LLVM to 11.0.0 - Refactor certs/Makefile - Change the format of include/config/auto.conf to stop double-quoting string type CONFIG options. - Fix ARCH=sh builds in dash - Separate compression macros for general purposes (cmd_bzip2 etc.) and the ones for decompressors (cmd_bzip2_with_size etc.) - Misc Makefile cleanups * tag 'kbuild-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (34 commits) kbuild: add cmd_file_size arch: decompressor: remove useless vmlinux.bin.all-y kbuild: rename cmd_{bzip2,lzma,lzo,lz4,xzkern,zstd22} kbuild: drop $(size_append) from cmd_zstd sh: rename suffix-y to suffix_y doc: kbuild: fix default in `imply` table microblaze: use built-in function to get CPU_{MAJOR,MINOR,REV} certs: move scripts/extract-cert to certs/ kbuild: do not quote string values in include/config/auto.conf kbuild: do not include include/config/auto.conf from shell scripts certs: simplify $(srctree)/ handling and remove config_filename macro kbuild: stop using config_filename in scripts/Makefile.modsign certs: remove misleading comments about GCC PR certs: refactor file cleaning certs: remove unneeded -I$(srctree) option for system_certificates.o certs: unify duplicated cmd_extract_certs and improve the log certs: use $< and $@ to simplify the key generation rule kbuild: remove headers_check stub kbuild: move headers_check.pl to usr/include/ certs: use if_changed to re-generate the key when the key type is changed ...
| * | | kbuild: rename cmd_{bzip2,lzma,lzo,lz4,xzkern,zstd22}Masahiro Yamada2022-01-141-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GZIP-compressed files end with 4 byte data that represents the size of the original input. The decompressors (the self-extracting kernel) exploit it to know the vmlinux size beforehand. To mimic the GZIP's trailer, Kbuild provides cmd_{bzip2,lzma,lzo,lz4,xzkern,zstd22}. Unfortunately these macros are used everywhere despite the appended size data is only useful for the decompressors. There is no guarantee that such hand-crafted trailers are safely ignored. In fact, the kernel refuses compressed initramdfs with the garbage data. That is why usr/Makefile overrides size_append to make it no-op. To limit the use of such broken compressed files, this commit renames the existing macros as follows: cmd_bzip2 --> cmd_bzip2_with_size cmd_lzma --> cmd_lzma_with_size cmd_lzo --> cmd_lzo_with_size cmd_lz4 --> cmd_lz4_with_size cmd_xzkern --> cmd_xzkern_with_size cmd_zstd22 --> cmd_zstd22_with_size To keep the decompressors working, I updated the following Makefiles accordingly: arch/arm/boot/compressed/Makefile arch/h8300/boot/compressed/Makefile arch/mips/boot/compressed/Makefile arch/parisc/boot/compressed/Makefile arch/s390/boot/compressed/Makefile arch/sh/boot/compressed/Makefile arch/x86/boot/compressed/Makefile I reused the current macro names for the normal usecases; they produce the compressed data in the proper format. I did not touch the following: arch/arc/boot/Makefile arch/arm64/boot/Makefile arch/csky/boot/Makefile arch/mips/boot/Makefile arch/riscv/boot/Makefile arch/sh/boot/Makefile kernel/Makefile This means those Makefiles will stop appending the size data. I dropped the 'override size_append' hack from usr/Makefile. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nicolas Schier <n.schier@avm.de>
* | | | Merge tag 'linux-watchdog-5.17-rc1' of ↵Linus Torvalds2022-01-171-0/+8
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://www.linux-watchdog.org/linux-watchdog Pull watchdog updates from Wim Van Sebroeck: - New device support: - Watchdog Timer driver for RZ/G2L - Realtek Otto watchdog timer - Apple SoC watchdog driver - Fintek F81966 - Remove BCM63XX_WDT after support for this SoC was added to BCM7038_WDT - Improvements of the BCM7038_WDT and s3c2410_wdt code - Several other fixes and improvements * tag 'linux-watchdog-5.17-rc1' of git://www.linux-watchdog.org/linux-watchdog: (38 commits) watchdog: msc313e: Check if the WDT was running at boot watchdog: Add Apple SoC watchdog driver dt-bindings: watchdog: Add SM6350 and SM8250 compatible watchdog: s3c2410: Fix getting the optional clock watchdog: s3c2410: Use platform_get_irq() to get the interrupt dt-bindings: watchdog: atmel: Add missing 'interrupts' property watchdog: mtk_wdt: use platform_get_irq_optional watchdog: Add Watchdog Timer driver for RZ/G2L dt-bindings: watchdog: renesas,wdt: Add support for RZ/G2L watchdog: da9063: Add hard dependency on I2C watchdog: Add Realtek Otto watchdog timer dt-bindings: watchdog: Realtek Otto WDT binding watchdog: s3c2410: Add Exynos850 support watchdog: da9063: use atomic safe i2c transfer in reset handler watchdog: davinci: Use div64_ul instead of do_div watchdog: Remove BCM63XX_WDT MIPS: BCM63XX: Provide platform data to watchdog device watchdog: bcm7038_wdt: Add platform device id for bcm63xx-wdt watchdog: Allow building BCM7038_WDT for BCM63XX watchdog: bcm7038_wdt: Support platform data configuration ...
| * | | | MIPS: BCM63XX: Provide platform data to watchdog deviceFlorian Fainelli2021-12-281-0/+8
| | |/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to utilize the bcm7038_wdt.c driver which needs to know the clock name to obtain, pass it via platform data using the bcm7038_wdt_platform_data structure. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20211112224636.395101-7-f.fainelli@gmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
* | | | Merge branch 'signal-for-v5.17' of ↵Linus Torvalds2022-01-171-1/+1
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull signal/exit/ptrace updates from Eric Biederman: "This set of changes deletes some dead code, makes a lot of cleanups which hopefully make the code easier to follow, and fixes bugs found along the way. The end-game which I have not yet reached yet is for fatal signals that generate coredumps to be short-circuit deliverable from complete_signal, for force_siginfo_to_task not to require changing userspace configured signal delivery state, and for the ptrace stops to always happen in locations where we can guarantee on all architectures that the all of the registers are saved and available on the stack. Removal of profile_task_ext, profile_munmap, and profile_handoff_task are the big successes for dead code removal this round. A bunch of small bug fixes are included, as most of the issues reported were small enough that they would not affect bisection so I simply added the fixes and did not fold the fixes into the changes they were fixing. There was a bug that broke coredumps piped to systemd-coredump. I dropped the change that caused that bug and replaced it entirely with something much more restrained. Unfortunately that required some rebasing. Some successes after this set of changes: There are few enough calls to do_exit to audit in a reasonable amount of time. The lifetime of struct kthread now matches the lifetime of struct task, and the pointer to struct kthread is no longer stored in set_child_tid. The flag SIGNAL_GROUP_COREDUMP is removed. The field group_exit_task is removed. Issues where task->exit_code was examined with signal->group_exit_code should been examined were fixed. There are several loosely related changes included because I am cleaning up and if I don't include them they will probably get lost. The original postings of these changes can be found at: https://lkml.kernel.org/r/87a6ha4zsd.fsf@email.froward.int.ebiederm.org https://lkml.kernel.org/r/87bl1kunjj.fsf@email.froward.int.ebiederm.org https://lkml.kernel.org/r/87r19opkx1.fsf_-_@email.froward.int.ebiederm.org I trimmed back the last set of changes to only the obviously correct once. Simply because there was less time for review than I had hoped" * 'signal-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (44 commits) ptrace/m68k: Stop open coding ptrace_report_syscall ptrace: Remove unused regs argument from ptrace_report_syscall ptrace: Remove second setting of PT_SEIZED in ptrace_attach taskstats: Cleanup the use of task->exit_code exit: Use the correct exit_code in /proc/<pid>/stat exit: Fix the exit_code for wait_task_zombie exit: Coredumps reach do_group_exit exit: Remove profile_handoff_task exit: Remove profile_task_exit & profile_munmap signal: clean up kernel-doc comments signal: Remove the helper signal_group_exit signal: Rename group_exit_task group_exec_task coredump: Stop setting signal->group_exit_task signal: Remove SIGNAL_GROUP_COREDUMP signal: During coredumps set SIGNAL_GROUP_EXIT in zap_process signal: Make coredump handling explicit in complete_signal signal: Have prepare_signal detect coredumps using signal->core_state signal: Have the oom killer detect coredumps using signal->core_state exit: Move force_uaccess back into do_exit exit: Guarantee make_task_dead leaks the tsk when calling do_task_exit ...
| * | | | exit: Add and use make_task_dead.Eric W. Biederman2021-12-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are two big uses of do_exit. The first is it's design use to be the guts of the exit(2) system call. The second use is to terminate a task after something catastrophic has happened like a NULL pointer in kernel code. Add a function make_task_dead that is initialy exactly the same as do_exit to cover the cases where do_exit is called to handle catastrophic failure. In time this can probably be reduced to just a light wrapper around do_task_dead. For now keep it exactly the same so that there will be no behavioral differences introducing this new concept. Replace all of the uses of do_exit that use it for catastraphic task cleanup with make_task_dead to make it clear what the code is doing. As part of this rename rewind_stack_do_exit rewind_stack_and_make_dead. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* | | | | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2022-01-166-33/+10
|\ \ \ \ \ | | |_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull kvm updates from Paolo Bonzini: "RISCV: - Use common KVM implementation of MMU memory caches - SBI v0.2 support for Guest - Initial KVM selftests support - Fix to avoid spurious virtual interrupts after clearing hideleg CSR - Update email address for Anup and Atish ARM: - Simplification of the 'vcpu first run' by integrating it into KVM's 'pid change' flow - Refactoring of the FP and SVE state tracking, also leading to a simpler state and less shared data between EL1 and EL2 in the nVHE case - Tidy up the header file usage for the nvhe hyp object - New HYP unsharing mechanism, finally allowing pages to be unmapped from the Stage-1 EL2 page-tables - Various pKVM cleanups around refcounting and sharing - A couple of vgic fixes for bugs that would trigger once the vcpu xarray rework is merged, but not sooner - Add minimal support for ARMv8.7's PMU extension - Rework kvm_pgtable initialisation ahead of the NV work - New selftest for IRQ injection - Teach selftests about the lack of default IPA space and page sizes - Expand sysreg selftest to deal with Pointer Authentication - The usual bunch of cleanups and doc update s390: - fix sigp sense/start/stop/inconsistency - cleanups x86: - Clean up some function prototypes more - improved gfn_to_pfn_cache with proper invalidation, used by Xen emulation - add KVM_IRQ_ROUTING_XEN_EVTCHN and event channel delivery - completely remove potential TOC/TOU races in nested SVM consistency checks - update some PMCs on emulated instructions - Intel AMX support (joint work between Thomas and Intel) - large MMU cleanups - module parameter to disable PMU virtualization - cleanup register cache - first part of halt handling cleanups - Hyper-V enlightened MSR bitmap support for nested hypervisors Generic: - clean up Makefiles - introduce CONFIG_HAVE_KVM_DIRTY_RING - optimize memslot lookup using a tree - optimize vCPU array usage by converting to xarray" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (268 commits) x86/fpu: Fix inline prefix warnings selftest: kvm: Add amx selftest selftest: kvm: Move struct kvm_x86_state to header selftest: kvm: Reorder vcpu_load_state steps for AMX kvm: x86: Disable interception for IA32_XFD on demand x86/fpu: Provide fpu_sync_guest_vmexit_xfd_state() kvm: selftests: Add support for KVM_CAP_XSAVE2 kvm: x86: Add support for getting/setting expanded xstate buffer x86/fpu: Add uabi_size to guest_fpu kvm: x86: Add CPUID support for Intel AMX kvm: x86: Add XCR0 support for Intel AMX kvm: x86: Disable RDMSR interception of IA32_XFD_ERR kvm: x86: Emulate IA32_XFD_ERR for guest kvm: x86: Intercept #NM for saving IA32_XFD_ERR x86/fpu: Prepare xfd_err in struct fpu_guest kvm: x86: Add emulation for IA32_XFD x86/fpu: Provide fpu_update_guest_xfd() for IA32_XFD emulation kvm: x86: Enable dynamic xfeatures at KVM_SET_CPUID2 x86/fpu: Provide fpu_enable_guest_xfd_features() for KVM x86/fpu: Add guest support to xfd_enable_feature() ...
| * | | | KVM: mips: Use Makefile.kvm for common filesDavid Woodhouse2021-12-091-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Message-Id: <20211121125451.9489-5-dwmw2@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | Merge branch 'kvm-on-hv-msrbm-fix' into HEADPaolo Bonzini2021-12-085-7/+5
| |\ \ \ \ | | | |_|/ | | |/| | | | | | | | | | | | Merge bugfix for enlightened MSR Bitmap, before adding support to KVM for exposing the feature to nested guests.
| * | | | KVM: Rename kvm_vcpu_block() => kvm_vcpu_halt()Sean Christopherson2021-12-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename kvm_vcpu_block() to kvm_vcpu_halt() in preparation for splitting the actual "block" sequences into a separate helper (to be named kvm_vcpu_block()). x86 will use the standalone block-only path to handle non-halt cases where the vCPU is not runnable. Rename block_ns to halt_ns to match the new function name. No functional change intended. Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211009021236.4122790-14-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | KVM: Drop obsolete kvm_arch_vcpu_block_finish()Sean Christopherson2021-12-081-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Drop kvm_arch_vcpu_block_finish() now that all arch implementations are nops. No functional change intended. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Matlack <dmatlack@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211009021236.4122790-10-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | KVM: Use interval tree to do fast hva lookup in memslotsMaciej S. Szmigiero2021-12-081-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current memslots implementation only allows quick binary search by gfn, quick lookup by hva is not possible - the implementation has to do a linear scan of the whole memslots array, even though the operation being performed might apply just to a single memslot. This significantly hurts performance of per-hva operations with higher memslot counts. Since hva ranges can overlap between memslots an interval tree is needed for tracking them. [sean: handle interval tree updates in kvm_replace_memslot()] Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Message-Id: <d66b9974becaa9839be9c4e1a5de97b177b4ac20.1638817640.git.maciej.szmigiero@oracle.com>
| * | | | KVM: Stop passing kvm_userspace_memory_region to arch memslot hooksSean Christopherson2021-12-081-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Drop the @mem param from kvm_arch_{prepare,commit}_memory_region() now that its use has been removed in all architectures. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Message-Id: <aa5ed3e62c27e881d0d8bc0acbc1572bc336dc19.1638817640.git.maciej.szmigiero@oracle.com>
| * | | | KVM: MIPS: Drop pr_debug from memslot commit to avoid using "mem"Sean Christopherson2021-12-081-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove an old (circa 2012) kvm_debug from kvm_arch_commit_memory_region() to print basic information when committing a memslot change. The primary motivation for removing the kvm_debug is to avoid using @mem, the user memory region, so that said param can be removed. Alternatively, the debug message could be converted to use @new, but that would require synthesizing select state to play nice with the DELETED case, which will pass NULL for @new in the future. And there's no argument to be had for dumping generic information in an arch callback, i.e. if there's a good reason for the debug message, then it belongs in common KVM code where all architectures can benefit. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Message-Id: <446929a668f6e1346751571b71db41e94e976cdf.1638817639.git.maciej.szmigiero@oracle.com>
| * | | | KVM: Let/force architectures to deal with arch specific memslot dataSean Christopherson2021-12-081-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pass the "old" slot to kvm_arch_prepare_memory_region() and force arch code to handle propagating arch specific data from "new" to "old" when necessary. This is a baby step towards dynamically allocating "new" from the get go, and is a (very) minor performance boost on x86 due to not unnecessarily copying arch data. For PPC HV, copy the rmap in the !CREATE and !DELETE paths, i.e. for MOVE and FLAGS_ONLY. This is functionally a nop as the previous behavior would overwrite the pointer for CREATE, and eventually discard/ignore it for DELETE. For x86, copy the arch data only for FLAGS_ONLY changes. Unlike PPC HV, x86 needs to reallocate arch data in the MOVE case as the size of x86's allocations depend on the alignment of the memslot's gfn. Opportunistically tweak kvm_arch_prepare_memory_region()'s param order to match the "commit" prototype. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> [mss: add missing RISCV kvm_arch_prepare_memory_region() change] Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Message-Id: <67dea5f11bbcfd71e3da5986f11e87f5dd4013f9.1638817639.git.maciej.szmigiero@oracle.com>
| * | | | KVM: mips: Use kvm_get_vcpu() instead of open-coded accessMarc Zyngier2021-12-082-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As we are about to change the way vcpus are allocated, mandate the use of kvm_get_vcpu() instead of open-coding the access. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Message-Id: <20211116160403.4074052-3-maz@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | KVM: Move wiping of the kvm->vcpus array to common codeMarc Zyngier2021-12-081-20/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All architectures have similar loops iterating over the vcpus, freeing one vcpu at a time, and eventually wiping the reference off the vcpus array. They are also inconsistently taking the kvm->lock mutex when wiping the references from the array. Make this code common, which will simplify further changes. The locking is dropped altogether, as this should only be called when there is no further references on the kvm structure. Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Message-Id: <20211116160403.4074052-2-maz@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>