summaryrefslogtreecommitdiffstats
path: root/arch/loongarch/kernel/process.c
Commit message (Collapse)AuthorAgeFilesLines
* LoongArch: Fix and simplify fcsr initialization on execve()Xi Ruoyao2024-01-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There has been a lingering bug in LoongArch Linux systems causing some GCC tests to intermittently fail (see Closes link). I've made a minimal reproducer: zsh% cat measure.s .align 4 .globl _start _start: movfcsr2gr $a0, $fcsr0 bstrpick.w $a0, $a0, 16, 16 beqz $a0, .ok break 0 .ok: li.w $a7, 93 syscall 0 zsh% cc mesaure.s -o measure -nostdlib zsh% echo $((1.0/3)) 0.33333333333333331 zsh% while ./measure; do ; done This while loop should not stop as POSIX is clear that execve must set fenv to the default, where FCSR should be zero. But in fact it will just stop after running for a while (normally less than 30 seconds). Note that "$((1.0/3))" is needed to reproduce this issue because it raises FE_INVALID and makes fcsr0 non-zero. The problem is we are currently relying on SET_PERSONALITY2() to reset current->thread.fpu.fcsr. But SET_PERSONALITY2() is executed before start_thread which calls lose_fpu(0). We can see if kernel preempt is enabled, we may switch to another thread after SET_PERSONALITY2() but before lose_fpu(0). Then bad thing happens: during the thread switch the value of the fcsr0 register is stored into current->thread.fpu.fcsr, making it dirty again. The issue can be fixed by setting current->thread.fpu.fcsr after lose_fpu(0) because lose_fpu() clears TIF_USEDFPU, then the thread switch won't touch current->thread.fpu.fcsr. The only other architecture setting FCSR in SET_PERSONALITY2() is MIPS. I've ran a similar test on MIPS with mainline kernel and it turns out MIPS is buggy, too. Anyway MIPS do this for supporting different FP flavors (NaN encodings, etc.) which do not exist on LoongArch. So for LoongArch, we can simply remove the current->thread.fpu.fcsr setting from SET_PERSONALITY2() and do it in start_thread(), after lose_fpu(0). The while loop failing with the mainline kernel has survived one hour after this change on LoongArch. Fixes: 803b0fc5c3f2baa ("LoongArch: Add process management") Closes: https://github.com/loongson-community/discussions/issues/7 Link: https://lore.kernel.org/linux-mips/7a6aa1bbdbbe2e63ae96ff163fab0349f58f1b9e.camel@xry111.site/ Cc: stable@vger.kernel.org Signed-off-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
* LoongArch: Fix some build warnings with W=1Bibo Mao2023-09-201-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are some building warnings when building LoongArch kernel with W=1 as following, this patch fixes them. arch/loongarch/kernel/acpi.c:284:13: warning: no previous prototype for ‘acpi_numa_arch_fixup’ [-Wmissing-prototypes] 284 | void __init acpi_numa_arch_fixup(void) {} | ^~~~~~~~~~~~~~~~~~~~ arch/loongarch/kernel/time.c:32:13: warning: no previous prototype for ‘constant_timer_interrupt’ [-Wmissing-prototypes] 32 | irqreturn_t constant_timer_interrupt(int irq, void *data) | ^~~~~~~~~~~~~~~~~~~~~~~~ arch/loongarch/kernel/traps.c:496:25: warning: no previous prototype for 'do_fpe' [-Wmissing-prototypes] 496 | asmlinkage void noinstr do_fpe(struct pt_regs *regs | ^~~~~~ arch/loongarch/kernel/traps.c:813:22: warning: variable ‘opcode’ set but not used [-Wunused-but-set-variable] 813 | unsigned int opcode; | ^~~~~~ arch/loongarch/kernel/signal.c:895:14: warning: no previous prototype for ‘get_sigframe’ [-Wmissing-prototypes] 895 | void __user *get_sigframe(struct ksignal *ksig, struct pt_regs *regs, | ^~~~~~~~~~~~ arch/loongarch/kernel/syscall.c:21:40: warning: initialized field overwritten [-Woverride-init] 21 | #define __SYSCALL(nr, call) [nr] = (call), | ^ arch/loongarch/kernel/syscall.c:40:14: warning: no previous prototype for ‘do_syscall’ [-Wmissing-prototypes] 40 | void noinstr do_syscall(struct pt_regs *regs) | ^~~~~~~~~~ arch/loongarch/kernel/smp.c:502:17: warning: no previous prototype for ‘start_secondary’ [-Wmissing-prototypes] 502 | asmlinkage void start_secondary(void) | ^~~~~~~~~~~~~~~ arch/loongarch/kernel/process.c:309:15: warning: no previous prototype for ‘arch_align_stack’ [-Wmissing-prototypes] 309 | unsigned long arch_align_stack(unsigned long sp) | ^~~~~~~~~~~~~~~~ arch/loongarch/kernel/topology.c:13:5: warning: no previous prototype for ‘arch_register_cpu’ [-Wmissing-prototypes] 13 | int arch_register_cpu(int cpu) | ^~~~~~~~~~~~~~~~~ arch/loongarch/kernel/topology.c:27:6: warning: no previous prototype for ‘arch_unregister_cpu’ [-Wmissing-prototypes] 27 | void arch_unregister_cpu(int cpu) | ^~~~~~~~~~~~~~~~~~~ arch/loongarch/kernel/module-sections.c:103:5: warning: no previous prototype for ‘module_frob_arch_sections’ [-Wmissing-prototypes] 103 | int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs, | ^~~~~~~~~~~~~~~~~~~~~~~~~ arch/loongarch/mm/hugetlbpage.c:56:5: warning: no previous prototype for ‘is_aligned_hugepage_range’ [-Wmissing-prototypes] 56 | int is_aligned_hugepage_range(unsigned long addr, unsigned long len) | ^~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
* Merge tag 'loongarch-6.6' of ↵Linus Torvalds2023-09-081-3/+12
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch updates from Huacai Chen: - Allow usage of LSX/LASX in the kernel, and use them for SIMD-optimized RAID5/RAID6 routines - Add Loongson Binary Translation (LBT) extension support - Add basic KGDB & KDB support - Add building with kcov coverage - Add KFENCE (Kernel Electric-Fence) support - Add KASAN (Kernel Address Sanitizer) support - Some bug fixes and other small changes - Update the default config file * tag 'loongarch-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (25 commits) LoongArch: Update Loongson-3 default config file LoongArch: Add KASAN (Kernel Address Sanitizer) support LoongArch: Simplify the processing of jumping new kernel for KASLR kasan: Add (pmd|pud)_init for LoongArch zero_(pud|p4d)_populate process kasan: Add __HAVE_ARCH_SHADOW_MAP to support arch specific mapping LoongArch: Add KFENCE (Kernel Electric-Fence) support LoongArch: Get partial stack information when providing regs parameter LoongArch: mm: Add page table mapped mode support for virt_to_page() kfence: Defer the assignment of the local variable addr LoongArch: Allow building with kcov coverage LoongArch: Provide kaslr_offset() to get kernel offset LoongArch: Add basic KGDB & KDB support LoongArch: Add Loongson Binary Translation (LBT) extension support raid6: Add LoongArch SIMD recovery implementation raid6: Add LoongArch SIMD syndrome calculation LoongArch: Add SIMD-optimized XOR routines LoongArch: Allow usage of LSX/LASX in the kernel LoongArch: Define symbol 'fault' as a local label in fpu.S LoongArch: Adjust {copy, clear}_user exception handler behavior LoongArch: Use static defined zero page rather than allocated ...
| * LoongArch: Add Loongson Binary Translation (LBT) extension supportQi Hu2023-09-061-3/+12
| | | | | | | | | | | | | | | | | | | | | | | | Loongson Binary Translation (LBT) is used to accelerate binary translation, which contains 4 scratch registers (scr0 to scr3), x86/ARM eflags (eflags) and x87 fpu stack pointer (ftop). This patch support kernel to save/restore these registers, handle the LBT exception and maintain sigcontext. Signed-off-by: Qi Hu <huqi@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
* | Merge tag 'mm-nonmm-stable-2023-08-28-22-48' of ↵Linus Torvalds2023-08-291-2/+2
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - An extensive rework of kexec and crash Kconfig from Eric DeVolder ("refactor Kconfig to consolidate KEXEC and CRASH options") - kernel.h slimming work from Andy Shevchenko ("kernel.h: Split out a couple of macros to args.h") - gdb feature work from Kuan-Ying Lee ("Add GDB memory helper commands") - vsprintf inclusion rationalization from Andy Shevchenko ("lib/vsprintf: Rework header inclusions") - Switch the handling of kdump from a udev scheme to in-kernel handling, by Eric DeVolder ("crash: Kernel handling of CPU and memory hot un/plug") - Many singleton patches to various parts of the tree * tag 'mm-nonmm-stable-2023-08-28-22-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (81 commits) document while_each_thread(), change first_tid() to use for_each_thread() drivers/char/mem.c: shrink character device's devlist[] array x86/crash: optimize CPU changes crash: change crash_prepare_elf64_headers() to for_each_possible_cpu() crash: hotplug support for kexec_load() x86/crash: add x86 crash hotplug support crash: memory and CPU hotplug sysfs attributes kexec: exclude elfcorehdr from the segment digest crash: add generic infrastructure for crash hotplug support crash: move a few code bits to setup support of crash hotplug kstrtox: consistently use _tolower() kill do_each_thread() nilfs2: fix WARNING in mark_buffer_dirty due to discarded buffer reuse scripts/bloat-o-meter: count weak symbol sizes treewide: drop CONFIG_EMBEDDED lockdep: fix static memory detection even more lib/vsprintf: declare no_hash_pointers in sprintf.h lib/vsprintf: split out sprintf() and friends kernel/fork: stop playing lockless games for exe_file replacement adfs: delete unused "union adfs_dirtail" definition ...
| * nmi_backtrace: allow excluding an arbitrary CPUDouglas Anderson2023-08-181-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The APIs that allow backtracing across CPUs have always had a way to exclude the current CPU. This convenience means callers didn't need to find a place to allocate a CPU mask just to handle the common case. Let's extend the API to take a CPU ID to exclude instead of just a boolean. This isn't any more complex for the API to handle and allows the hardlockup detector to exclude a different CPU (the one it already did a trace for) without needing to find space for a CPU mask. Arguably, this new API also encourages safer behavior. Specifically if the caller wants to avoid tracing the current CPU (maybe because they already traced the current CPU) this makes it more obvious to the caller that they need to make sure that the current CPU ID can't change. [akpm@linux-foundation.org: fix trigger_allbutcpu_cpu_backtrace() stub] Link: https://lkml.kernel.org/r/20230804065935.v4.1.Ia35521b91fc781368945161d7b28538f9996c182@changeid Signed-off-by: Douglas Anderson <dianders@chromium.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: kernel test robot <lkp@intel.com> Cc: Lecopzer Chen <lecopzer.chen@mediatek.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Pingfan Liu <kernelfans@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | LoongArch: Put the body of play_dead() into arch_cpu_idle_dead()Tiezhu Yang2023-08-251-7/+0
|/ | | | | | | | | | | | | | | | | | | | | | The initial aim is to silence the following objtool warning: arch/loongarch/kernel/process.o: warning: objtool: arch_cpu_idle_dead() falls through to next function start_thread() According to tools/objtool/Documentation/objtool.txt, this is because the last instruction of arch_cpu_idle_dead() is a call to a noreturn function play_dead(). In order to silence the warning, one simple way is to add the noreturn function play_dead() to objtool's hard-coded global_noreturns array, that is to say, just put "NORETURN(play_dead)" into tools/objtool/noreturns.h, it works well. But I noticed that play_dead() is only defined once and only called by arch_cpu_idle_dead(), so put the body of play_dead() into the caller arch_cpu_idle_dead(), then remove the noreturn function play_dead() is an alternative way which can reduce the overhead of the function call at the same time. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
* LoongArch: Add vector extensions supportHuacai Chen2023-06-291-2/+8
| | | | | | | | | | | | | | | | | | | | | Add LoongArch's vector extensions support, which including 128bit LSX (i.e., Loongson SIMD eXtension) and 256bit LASX (i.e., Loongson Advanced SIMD eXtension). Linux kernel doesn't use vector itself, it only handle exceptions and context save/restore. So it only needs a subset of these instructions: * Vector load/store: vld vst vldx vstx xvld xvst xvldx xvstx * 8bit-elements move: vpickve2gr.b xvpickve2gr.b vinsgr2vr.b xvinsgr2vr.b * 16bit-elements move: vpickve2gr.h xvpickve2gr.h vinsgr2vr.h xvinsgr2vr.h * 32bit-elements move: vpickve2gr.w xvpickve2gr.w vinsgr2vr.w xvinsgr2vr.w * 64bit-elements move: vpickve2gr.d xvpickve2gr.d vinsgr2vr.d xvinsgr2vr.d * Elements permute: vpermi.w vpermi.d xvpermi.w xvpermi.d xvpermi.q Introduce AS_HAS_LSX_EXTENSION and AS_HAS_LASX_EXTENSION to avoid non- vector toolchains complains unsupported instructions. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
* LoongArch: Add support to clone a time namespaceTiezhu Yang2023-06-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can see that "Time namespaces are not supported" on LoongArch: (1) clone3 test # cd tools/testing/selftests/clone3 && make && ./clone3 ... # Time namespaces are not supported ok 18 # SKIP Skipping clone3() with CLONE_NEWTIME # Totals: pass:17 fail:0 xfail:0 xpass:0 skip:1 error:0 (2) timens test # cd tools/testing/selftests/timens && make && ./timens ... 1..0 # SKIP Time namespaces are not supported On LoongArch the current kernel does not support CONFIG_TIME_NS which depends on GENERIC_VDSO_TIME_NS, select GENERIC_VDSO_TIME_NS to enable CONFIG_TIME_NS to build kernel/time/namespace.c. Additionally, it needs to define some arch-dependent functions for the timens, such as __arch_get_timens_vdso_data(), arch_get_vdso_data() and vdso_join_timens(). At the same time, modify the layout of vvar to use one page size for generic vdso data, expand another page size for timens vdso data and assign LOONGARCH_VDSO_DATA_SIZE (maybe exceeds a page size if expand in the future) for loongarch vdso data, at last add the callback function vvar_fault() and modify stack_top(). With this patch under CONFIG_TIME_NS: (1) clone3 test # cd tools/testing/selftests/clone3 && make && ./clone3 ... ok 18 [739] Result (0) matches expectation (0) # Totals: pass:18 fail:0 xfail:0 xpass:0 skip:0 error:0 (2) timens test # cd tools/testing/selftests/timens && make && ./timens ... # Totals: pass:10 fail:0 xfail:0 xpass:0 skip:0 error:0 Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
* sched/idle: Mark arch_cpu_idle_dead() __noreturnJosh Poimboeuf2023-03-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Before commit 076cbf5d2163 ("x86/xen: don't let xen_pv_play_dead() return"), in Xen, when a previously offlined CPU was brought back online, it unexpectedly resumed execution where it left off in the middle of the idle loop. There were some hacks to make that work, but the behavior was surprising as do_idle() doesn't expect an offlined CPU to return from the dead (in arch_cpu_idle_dead()). Now that Xen has been fixed, and the arch-specific implementations of arch_cpu_idle_dead() also don't return, give it a __noreturn attribute. This will cause the compiler to complain if an arch-specific implementation might return. It also improves code generation for both caller and callee. Also fixes the following warning: vmlinux.o: warning: objtool: do_idle+0x25f: unreachable instruction Reported-by: Paul E. McKenney <paulmck@kernel.org> Tested-by: Paul E. McKenney <paulmck@kernel.org> Link: https://lore.kernel.org/r/60d527353da8c99d4cf13b6473131d46719ed16d.1676358308.git.jpoimboe@kernel.org Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
* LoongArch: Add hardware breakpoints/watchpoints supportQing Zhang2023-02-251-0/+7
| | | | | | | | | | | | | | | | | | | Use perf framework to manage hardware instruction and data breakpoints. LoongArch defines hardware watchpoint functions for instruction fetch and memory load/store operations. After the software configures hardware watchpoints, the processor hardware will monitor the access address of the instruction fetch and load/store operation, and trigger an exception of the watchpoint when it meets the conditions set by the watchpoint. The hardware monitoring points for instruction fetching and load/store operations each have a register for the overall configuration of all monitoring points, a register for recording the status of all monitoring points, and four registers required for configuration of each watchpoint individually. Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
* LoongArch: Get frame info in unwind_start() when regs is not availableJinyang He2023-01-171-9/+3
| | | | | | | | | At unwind_start(), it is better to get its frame info here rather than get them outside, even we don't have 'regs'. In this way we can simply use unwind_{start, next_frame, done} outside. Signed-off-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
* Merge tag 'loongarch-6.2' of ↵Linus Torvalds2022-12-191-0/+6
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch updates from Huacai Chen: - Switch to relative exception tables - Add unaligned access support - Add alternative runtime patching mechanism - Add FDT booting support from efi system table - Add suspend/hibernation (ACPI S3/S4) support - Add basic STACKPROTECTOR support - Add ftrace (function tracer) support - Update the default config file * tag 'loongarch-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (24 commits) LoongArch: Update Loongson-3 default config file LoongArch: modules/ftrace: Initialize PLT at load time LoongArch/ftrace: Add HAVE_FUNCTION_GRAPH_RET_ADDR_PTR support LoongArch/ftrace: Add HAVE_DYNAMIC_FTRACE_WITH_ARGS support LoongArch/ftrace: Add HAVE_DYNAMIC_FTRACE_WITH_REGS support LoongArch/ftrace: Add dynamic function graph tracer support LoongArch/ftrace: Add dynamic function tracer support LoongArch/ftrace: Add recordmcount support LoongArch/ftrace: Add basic support LoongArch: module: Use got/plt section indices for relocations LoongArch: Add basic STACKPROTECTOR support LoongArch: Add hibernation (ACPI S4) support LoongArch: Add suspend (ACPI S3) support LoongArch: Add processing ISA Node in DeviceTree LoongArch: Add FDT booting support from efi system table LoongArch: Use alternative to optimize libraries LoongArch: Add alternative runtime patching mechanism LoongArch: Add unaligned access support LoongArch: BPF: Add BPF exception tables LoongArch: Remove the .fixup section usage ...
| * LoongArch: Add basic STACKPROTECTOR supportHuacai Chen2022-12-141-0/+6
| | | | | | | | | | | | | | | | Add basic stack protector support similar to other architectures. A constant canary value is set at boot time, and with help of compiler's -fstack-protector we can detect stack corruption. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
* | Merge tag 'random-6.2-rc1-for-linus' of ↵Linus Torvalds2022-12-121-1/+1
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator updates from Jason Donenfeld: - Replace prandom_u32_max() and various open-coded variants of it, there is now a new family of functions that uses fast rejection sampling to choose properly uniformly random numbers within an interval: get_random_u32_below(ceil) - [0, ceil) get_random_u32_above(floor) - (floor, U32_MAX] get_random_u32_inclusive(floor, ceil) - [floor, ceil] Coccinelle was used to convert all current users of prandom_u32_max(), as well as many open-coded patterns, resulting in improvements throughout the tree. I'll have a "late" 6.1-rc1 pull for you that removes the now unused prandom_u32_max() function, just in case any other trees add a new use case of it that needs to converted. According to linux-next, there may be two trivial cases of prandom_u32_max() reintroductions that are fixable with a 's/.../.../'. So I'll have for you a final conversion patch doing that alongside the removal patch during the second week. This is a treewide change that touches many files throughout. - More consistent use of get_random_canary(). - Updates to comments, documentation, tests, headers, and simplification in configuration. - The arch_get_random*_early() abstraction was only used by arm64 and wasn't entirely useful, so this has been replaced by code that works in all relevant contexts. - The kernel will use and manage random seeds in non-volatile EFI variables, refreshing a variable with a fresh seed when the RNG is initialized. The RNG GUID namespace is then hidden from efivarfs to prevent accidental leakage. These changes are split into random.c infrastructure code used in the EFI subsystem, in this pull request, and related support inside of EFISTUB, in Ard's EFI tree. These are co-dependent for full functionality, but the order of merging doesn't matter. - Part of the infrastructure added for the EFI support is also used for an improvement to the way vsprintf initializes its siphash key, replacing an sleep loop wart. - The hardware RNG framework now always calls its correct random.c input function, add_hwgenerator_randomness(), rather than sometimes going through helpers better suited for other cases. - The add_latent_entropy() function has long been called from the fork handler, but is a no-op when the latent entropy gcc plugin isn't used, which is fine for the purposes of latent entropy. But it was missing out on the cycle counter that was also being mixed in beside the latent entropy variable. So now, if the latent entropy gcc plugin isn't enabled, add_latent_entropy() will expand to a call to add_device_randomness(NULL, 0), which adds a cycle counter, without the absent latent entropy variable. - The RNG is now reseeded from a delayed worker, rather than on demand when used. Always running from a worker allows it to make use of the CPU RNG on platforms like S390x, whose instructions are too slow to do so from interrupts. It also has the effect of adding in new inputs more frequently with more regularity, amounting to a long term transcript of random values. Plus, it helps a bit with the upcoming vDSO implementation (which isn't yet ready for 6.2). - The jitter entropy algorithm now tries to execute on many different CPUs, round-robining, in hopes of hitting even more memory latencies and other unpredictable effects. It also will mix in a cycle counter when the entropy timer fires, in addition to being mixed in from the main loop, to account more explicitly for fluctuations in that timer firing. And the state it touches is now kept within the same cache line, so that it's assured that the different execution contexts will cause latencies. * tag 'random-6.2-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (23 commits) random: include <linux/once.h> in the right header random: align entropy_timer_state to cache line random: mix in cycle counter when jitter timer fires random: spread out jitter callback to different CPUs random: remove extraneous period and add a missing one in comments efi: random: refresh non-volatile random seed when RNG is initialized vsprintf: initialize siphash key using notifier random: add back async readiness notifier random: reseed in delayed work rather than on-demand random: always mix cycle counter in add_latent_entropy() hw_random: use add_hwgenerator_randomness() for early entropy random: modernize documentation comment on get_random_bytes() random: adjust comment to account for removed function random: remove early archrandom abstraction random: use random.trust_{bootloader,cpu} command line option only stackprotector: actually use get_random_canary() stackprotector: move get_random_canary() into stackprotector.h treewide: use get_random_u32_inclusive() when possible treewide: use get_random_u32_{above,below}() instead of manual loop treewide: use get_random_u32_below() instead of deprecated function ...
| * treewide: use get_random_u32_below() instead of deprecated functionJason A. Donenfeld2022-11-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a simple mechanical transformation done by: @@ expression E; @@ - prandom_u32_max + get_random_u32_below (E) Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs Reviewed-by: SeongJae Park <sj@kernel.org> # for damon Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> # for infiniband Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> # for arm Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* | LoongArch: Clear FPU/SIMD thread info flags for kernel threadHuacai Chen2022-11-211-4/+5
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a kernel thread is created by a user thread, it may carry FPU/SIMD thread info flags (TIF_USEDFPU, TIF_USEDSIMD, etc.). Then it will be considered as a fpu owner and kernel try to save its FPU/SIMD context and cause such errors: [ 41.518931] do_fpu invoked from kernel context![#1]: [ 41.523933] CPU: 1 PID: 395 Comm: iou-wrk-394 Not tainted 6.1.0-rc5+ #217 [ 41.530757] Hardware name: Loongson Loongson-3A5000-7A1000-1w-CRB/Loongson-LS3A5000-7A1000-1w-CRB, BIOS vUDK2018-LoongArch-V2.0.pre-beta8 08/18/2022 [ 41.544064] $ 0 : 0000000000000000 90000000011e9468 9000000106c7c000 9000000106c7fcf0 [ 41.552101] $ 4 : 9000000106305d40 9000000106689800 9000000106c7fd08 0000000003995818 [ 41.560138] $ 8 : 0000000000000001 90000000009a72e4 0000000000000020 fffffffffffffffc [ 41.568174] $12 : 0000000000000000 0000000000000000 0000000000000020 00000009aab7e130 [ 41.576211] $16 : 00000000000001ff 0000000000000407 0000000000000001 0000000000000000 [ 41.584247] $20 : 0000000000000000 0000000000000001 9000000106c7fd70 90000001002f0400 [ 41.592284] $24 : 0000000000000000 900000000178f740 90000000011e9834 90000001063057c0 [ 41.600320] $28 : 0000000000000000 0000000000000001 9000000006826b40 9000000106305140 [ 41.608356] era : 9000000000228848 _save_fp+0x0/0xd8 [ 41.613542] ra : 90000000011e9468 __schedule+0x568/0x8d0 [ 41.619160] CSR crmd: 000000b0 [ 41.619163] CSR prmd: 00000000 [ 41.622359] CSR euen: 00000000 [ 41.625558] CSR ecfg: 00071c1c [ 41.628756] CSR estat: 000f0000 [ 41.635239] ExcCode : f (SubCode 0) [ 41.638783] PrId : 0014c010 (Loongson-64bit) [ 41.643191] Modules linked in: acpi_ipmi vfat fat ipmi_si ipmi_devintf cfg80211 ipmi_msghandler rfkill fuse efivarfs [ 41.653734] Process iou-wrk-394 (pid: 395, threadinfo=0000000004ebe913, task=00000000636fa1be) [ 41.662375] Stack : 00000000ffff0875 9000000006800ec0 9000000006800ec0 90000000002d57e0 [ 41.670412] 0000000000000001 0000000000000000 9000000106535880 0000000000000001 [ 41.678450] 9000000105291800 0000000000000000 9000000105291838 900000000178e000 [ 41.686487] 9000000106c7fd90 9000000106305140 0000000000000001 90000000011e9834 [ 41.694523] 00000000ffff0875 90000000011f034c 9000000105291838 9000000105291830 [ 41.702561] 0000000000000000 9000000006801440 00000000ffff0875 90000000002d48c0 [ 41.710597] 9000000128800001 9000000106305140 9000000105291838 9000000105291838 [ 41.718634] 9000000105291830 9000000107811740 9000000105291848 90000000009bf1e0 [ 41.726672] 9000000105291830 9000000107811748 2d6b72772d756f69 0000000000343933 [ 41.734708] 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 41.742745] ... [ 41.745252] Call Trace: [ 42.197868] [<9000000000228848>] _save_fp+0x0/0xd8 [ 42.205214] [<90000000011ed468>] __schedule+0x568/0x8d0 [ 42.210485] [<90000000011ed834>] schedule+0x64/0xd4 [ 42.215411] [<90000000011f434c>] schedule_timeout+0x88/0x188 [ 42.221115] [<90000000009c36d0>] io_wqe_worker+0x184/0x350 [ 42.226645] [<9000000000221cf0>] ret_from_kernel_thread+0xc/0x9c This can be easily triggered by ltp testcase syscalls/io_uring02 and it can also be easily fixed by clearing the FPU/SIMD thread info flags for kernel threads in copy_thread(). Cc: stable@vger.kernel.org Reported-by: Qi Hu <huqi@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
* LoongArch: Remove unused kernel stack paddingJinyang He2022-10-291-2/+2
| | | | | | | | | | The current LoongArch kernel stack is padded as if obeying the MIPS o32 calling convention (32 bytes), signifying the port's MIPS lineage but no longer making sense. Remove the padding for clarity. Reviewed-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
* treewide: use prandom_u32_max() when possible, part 1Jason A. Donenfeld2022-10-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than incurring a division or requesting too many random bytes for the given range, use the prandom_u32_max() function, which only takes the minimum required bytes from the RNG and avoids divisions. This was done mechanically with this coccinelle script: @basic@ expression E; type T; identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; typedef u64; @@ ( - ((T)get_random_u32() % (E)) + prandom_u32_max(E) | - ((T)get_random_u32() & ((E) - 1)) + prandom_u32_max(E * XXX_MAKE_SURE_E_IS_POW2) | - ((u64)(E) * get_random_u32() >> 32) + prandom_u32_max(E) | - ((T)get_random_u32() & ~PAGE_MASK) + prandom_u32_max(PAGE_SIZE) ) @multi_line@ identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; identifier RAND; expression E; @@ - RAND = get_random_u32(); ... when != RAND - RAND %= (E); + RAND = prandom_u32_max(E); // Find a potential literal @literal_mask@ expression LITERAL; type T; identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; position p; @@ ((T)get_random_u32()@p & (LITERAL)) // Add one to the literal. @script:python add_one@ literal << literal_mask.LITERAL; RESULT; @@ value = None if literal.startswith('0x'): value = int(literal, 16) elif literal[0] in '123456789': value = int(literal, 10) if value is None: print("I don't know how to handle %s" % (literal)) cocci.include_match(False) elif value == 2**32 - 1 or value == 2**31 - 1 or value == 2**24 - 1 or value == 2**16 - 1 or value == 2**8 - 1: print("Skipping 0x%x for cleanup elsewhere" % (value)) cocci.include_match(False) elif value & (value + 1) != 0: print("Skipping 0x%x because it's not a power of two minus one" % (value)) cocci.include_match(False) elif literal.startswith('0x'): coccinelle.RESULT = cocci.make_expr("0x%x" % (value + 1)) else: coccinelle.RESULT = cocci.make_expr("%d" % (value + 1)) // Replace the literal mask with the calculated result. @plus_one@ expression literal_mask.LITERAL; position literal_mask.p; expression add_one.RESULT; identifier FUNC; @@ - (FUNC()@p & (LITERAL)) + prandom_u32_max(RESULT) @collapse_ret@ type T; identifier VAR; expression E; @@ { - T VAR; - VAR = (E); - return VAR; + return E; } @drop_var@ type T; identifier VAR; @@ { - T VAR; ... when != VAR } Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Yury Norov <yury.norov@gmail.com> Reviewed-by: KP Singh <kpsingh@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> # for ext4 and sbitmap Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> # for drbd Acked-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390 Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* LoongArch: Add STACKTRACE supportQing Zhang2022-08-121-1/+28
| | | | | | | | | | | | | | 1. Use common arch_stack_walk() infrastructure to avoid duplicated code and avoid taking care of the stack storage and filtering. 2. Add sched_ra (means sched return address) and sched_cfa (means sched call frame address) to thread_info, and store them in switch_to(). 3. Add __get_wchan() implementation. Now we can print the process stack and wait channel by cat /proc/*/stack and /proc/*/wchan. Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
* LoongArch: Add guess unwinder supportQing Zhang2022-08-121-0/+61
| | | | | | | | | | | | | | | | | | | | Name "guess unwinder" comes from x86, it scans the stack and reports every kernel text address it finds. Unwinders can be used by dump_stack() and other stacktrace functions. Three stages when we do unwind, 1) unwind_start(), the prapare of unwinding, fill unwind_state. 2) unwind_done(), judge whether the unwind process is finished or not. 3) unwind_next_frame(), unwind the next frame. Add get_stack_info() to get stack info. At present we have irq stack and task stack. The next_sp is the key info between two types of stacks. Dividing unwinder helps to add new unwinders in the future. Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
* LoongArch: Fix copy_thread() build errorsHuacai Chen2022-06-081-6/+8
| | | | | | | | | | | Commit c5febea0956fd387 ("fork: Pass struct kernel_clone_args into copy_thread") change the prototype of copy_thread(), while commit 5bd2e97c868a8a44 ("fork: Generalize PF_IO_WORKER handling") change the structure of kernel_clone_args. They cause build errors, so fix it. Fixes: 5bd2e97c868a8a44 ("fork: Generalize PF_IO_WORKER handling") Fixes: c5febea0956fd387 ("fork: Pass struct kernel_clone_args into copy_thread") Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
* LoongArch: Add multi-processor (SMP) supportHuacai Chen2022-06-031-0/+7
| | | | | | | | | LoongArch-based procesors have 4, 8 or 16 cores per package. This patch adds multi-processor (SMP) support for LoongArch. Reviewed-by: WANG Xuerui <git@xen0n.name> Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
* LoongArch: Add process managementHuacai Chen2022-06-031-0/+260
Add process management support for LoongArch, including: thread info definition, context switch and process tracing. Reviewed-by: WANG Xuerui <git@xen0n.name> Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>