linux-stable.git - Linux kernel stable tree

	Commit message (Collapse)	Author	Age	Files	Lines
*	x86, prctl: Hook L1D flushing in via prctl	Balbir Singh	2021-07-28	2	-0/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use the existing PR_GET/SET_SPECULATION_CTRL API to expose the L1D flush capability. For L1D flushing PR_SPEC_FORCE_DISABLE and PR_SPEC_DISABLE_NOEXEC are not supported. Enabling L1D flush does not check if the task is running on an SMT enabled core, rather a check is done at runtime (at the time of flush), if the task runs on a SMT sibling then the task is sent a SIGBUS which is executed before the task returns to user space or to a guest. This is better than the other alternatives of: a. Ensuring strict affinity of the task (hard to enforce without further changes in the scheduler) b. Silently skipping flush for tasks that move to SMT enabled cores. Hook up the core prctl and implement the x86 specific parts which in turn makes it functional. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Balbir Singh <sblbir@amazon.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210108121056.21940-5-sblbir@amazon.com
*	x86/mm: Prepare for opt-in based L1D flush in switch_mm()	Balbir Singh	2021-07-28	5	-2/+98
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The goal of this is to allow tasks that want to protect sensitive information, against e.g. the recently found snoop assisted data sampling vulnerabilites, to flush their L1D on being switched out. This protects their data from being snooped or leaked via side channels after the task has context switched out. This could also be used to wipe L1D when an untrusted task is switched in, but that's not a really well defined scenario while the opt-in variant is clearly defined. The mechanism is default disabled and can be enabled on the kernel command line. Prepare for the actual prctl based opt-in: 1) Provide the necessary setup functionality similar to the other mitigations and enable the static branch when the command line option is set and the CPU provides support for hardware assisted L1D flushing. Software based L1D flush is not supported because it's CPU model specific and not really well defined. This does not come with a sysfs file like the other mitigations because it is not bound to any specific vulnerability. Support has to be queried via the prctl(2) interface. 2) Add TIF_SPEC_L1D_FLUSH next to L1D_SPEC_IB so the two bits can be mangled into the mm pointer in one go which allows to reuse the existing mechanism in switch_mm() for the conditional IBPB speculation barrier efficiently. 3) Add the L1D flush specific functionality which flushes L1D when the outgoing task opted in. Also check whether the incoming task has requested L1D flush and if so validate that it is not accidentaly running on an SMT sibling as this makes the whole excercise moot because SMT siblings share L1D which opens tons of other attack vectors. If that happens schedule task work which signals the incoming task on return to user/guest with SIGBUS as this is part of the paranoid L1D flush contract. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Balbir Singh <sblbir@amazon.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210108121056.21940-1-sblbir@amazon.com
*	x86/process: Make room for TIF_SPEC_L1D_FLUSH	Balbir Singh	2021-07-28	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	The upcoming support for paranoid L1D flush in switch_mm() requires that TIF_SPEC_IB and the new TIF_SPEC_L1D_FLUSH are two consecutive bits in thread_info::flags. Move TIF_SPEC_FORCE_UPDATE to a spare bit to make room for the new one. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Balbir Singh <sblbir@amazon.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210108121056.21940-1-sblbir@amazon.com
*	sched: Add task_work callback for paranoid L1D flush	Balbir Singh	2021-07-28	2	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The upcoming paranoid L1D flush infrastructure allows to conditionally (opt-in) flush L1D in switch_mm() as a defense against potential new side channels or for paranoia reasons. As the flush makes only sense when a task runs on a non-SMT enabled core, because SMT siblings share L1, the switch_mm() logic will kill a task which is flagged for L1D flush when it is running on a SMT thread. Add a taskwork callback so switch_mm() can queue a SIG_KILL command which is invoked when the task tries to return to user space. Signed-off-by: Balbir Singh <sblbir@amazon.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210108121056.21940-1-sblbir@amazon.com
*	x86/mm: Refactor cond_ibpb() to support other use cases	Balbir Singh	2021-07-28	2	-25/+30
\| \| \| \| \| \| \| \| \| \| \|	cond_ibpb() has the necessary bits required to track the previous mm in switch_mm_irqs_off(). This can be reused for other use cases like L1D flushing on context switch. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Balbir Singh <sblbir@amazon.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210108121056.21940-3-sblbir@amazon.com
*	x86/smp: Add a per-cpu view of SMT state	Balbir Singh	2021-07-28	2	-1/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A new field smt_active in cpuinfo_x86 identifies if the current core/cpu is in SMT mode or not. This is helpful when the system has some of its cores with threads offlined and can be used for cases where action is taken based on the state of SMT. The upcoming support for paranoid L1D flush will make use of this information. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Balbir Singh <sblbir@amazon.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210108121056.21940-2-sblbir@amazon.com
*	Linux 5.14-rc3v5.14-rc3	Linus Torvalds	2021-07-25	1	-1/+1
\|
*	smpboot: fix duplicate and misplaced inlining directive	Linus Torvalds	2021-07-25	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	gcc doesn't care, but clang quite reasonably pointed out that the recent commit e9ba16e68cce ("smpboot: Mark idle_init() as __always_inlined to work around aggressive compiler un-inlining") did some really odd things: kernel/smpboot.c:50:20: warning: duplicate 'inline' declaration specifier [-Wduplicate-decl-specifier] static inline void __always_inline idle_init(unsigned int cpu) ^ which not only has that duplicate inlining specifier, but the new __always_inline was put in the wrong place of the function definition. We put the storage class specifiers (ie things like "static" and "extern") first, and the type information after that. And while the compiler may not care, we put the inline specifier before the types. So it should be just static __always_inline void idle_init(unsigned int cpu) instead. Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	Merge tag 'powerpc-5.14-3' of ↵	Linus Torvalds	2021-07-25	5	-8/+68
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix guest to host memory corruption in H_RTAS due to missing nargs check. - Fix guest triggerable host crashes due to bad handling of nested guest TM state. - Fix possible crashes due to incorrect reference counting in kvm_arch_vcpu_ioctl(). - Two commits fixing some regressions in KVM transactional memory handling introduced by the recent rework of the KVM code. Thanks to Nicholas Piggin, Alexey Kardashevskiy, and Michael Neuling. * tag 'powerpc-5.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: KVM: PPC: Book3S HV Nested: Sanitise H_ENTER_NESTED TM state KVM: PPC: Book3S: Fix H_RTAS rets buffer overflow KVM: PPC: Fix kvm_arch_vcpu_ioctl vcpu_load leak KVM: PPC: Book3S: Fix CONFIG_TRANSACTIONAL_MEM=n crash KVM: PPC: Book3S HV P9: Fix guest TM support
\| *	KVM: PPC: Book3S HV Nested: Sanitise H_ENTER_NESTED TM state	Nicholas Piggin	2021-07-23	1	-0/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The H_ENTER_NESTED hypercall is handled by the L0, and it is a request by the L1 to switch the context of the vCPU over to that of its L2 guest, and return with an interrupt indication. The L1 is responsible for switching some registers to guest context, and the L0 switches others (including all the hypervisor privileged state). If the L2 MSR has TM active, then the L1 is responsible for recheckpointing the L2 TM state. Then the L1 exits to L0 via the H_ENTER_NESTED hcall, and the L0 saves the TM state as part of the exit, and then it recheckpoints the TM state as part of the nested entry and finally HRFIDs into the L2 with TM active MSR. Not efficient, but about the simplest approach for something that's horrendously complicated. Problems arise if the L1 exits to the L0 with a TM state which does not match the L2 TM state being requested. For example if the L1 is transactional but the L2 MSR is non-transactional, or vice versa. The L0's HRFID can take a TM Bad Thing interrupt and crash. Fix this by disallowing H_ENTER_NESTED in TM[T] state entirely, and then ensuring that if the L1 is suspended then the L2 must have TM active, and if the L1 is not suspended then the L2 must not have TM active. Fixes: 360cae313702 ("KVM: PPC: Book3S HV: Nested guest entry via hypercall") Cc: stable@vger.kernel.org # v4.20+ Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
\| *	KVM: PPC: Book3S: Fix H_RTAS rets buffer overflow	Nicholas Piggin	2021-07-23	1	-3/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The kvmppc_rtas_hcall() sets the host rtas_args.rets pointer based on the rtas_args.nargs that was provided by the guest. That guest nargs value is not range checked, so the guest can cause the host rets pointer to be pointed outside the args array. The individual rtas function handlers check the nargs and nrets values to ensure they are correct, but if they are not, the handlers store a -3 (0xfffffffd) failure indication in rets[0] which corrupts host memory. Fix this by testing up front whether the guest supplied nargs and nret would exceed the array size, and fail the hcall directly without storing a failure indication to rets[0]. Also expand on a comment about why we kill the guest and try not to return errors directly if we have a valid rets[0] pointer. Fixes: 8e591cb72047 ("KVM: PPC: Book3S: Add infrastructure to implement kernel-side RTAS calls") Cc: stable@vger.kernel.org # v3.10+ Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
\| *	KVM: PPC: Fix kvm_arch_vcpu_ioctl vcpu_load leak	Nicholas Piggin	2021-07-17	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	vcpu_put is not called if the user copy fails. This can result in preempt notifier corruption and crashes, among other issues. Fixes: b3cebfe8c1ca ("KVM: PPC: Move vcpu_load/vcpu_put down to each ioctl case in kvm_arch_vcpu_ioctl") Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210716024310.164448-2-npiggin@gmail.com
\| *	KVM: PPC: Book3S: Fix CONFIG_TRANSACTIONAL_MEM=n crash	Nicholas Piggin	2021-07-17	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When running CPU_FTR_P9_TM_HV_ASSIST, HFSCR[TM] is set for the guest even if the host has CONFIG_TRANSACTIONAL_MEM=n, which causes it to be unprepared to handle guest exits while transactional. Normal guests don't have a problem because the HTM capability will not be advertised, but a rogue or buggy one could crash the host. Fixes: 4bb3c7a0208f ("KVM: PPC: Book3S HV: Work around transactional memory bugs in POWER9") Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210716024310.164448-1-npiggin@gmail.com
\| *	KVM: PPC: Book3S HV P9: Fix guest TM support	Nicholas Piggin	2021-07-15	1	-3/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The conversion to C introduced several bugs in TM handling that can cause host crashes with TM bad thing interrupts. Mostly just simple typos or missed logic in the conversion that got through due to my not testing TM in the guest sufficiently. - Early TM emulation for the softpatch interrupt should be done if fake suspend mode is _not_ active. - Early TM emulation wants to return immediately to the guest so as to not doom transactions unnecessarily. - And if exiting from the guest, the host MSR should include the TM[S] bit if the guest was T/S, before it is treclaimed. After this fix, all the TM selftests pass when running on a P9 processor that implements TM with softpatch interrupt. Fixes: 89d35b2391015 ("KVM: PPC: Book3S HV P9: Implement the rest of the P9 path in C") Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210712013650.376325-1-npiggin@gmail.com
* \|	Merge tag 'timers-urgent-2021-07-25' of ↵	Linus Torvalds	2021-07-25	2	-8/+10
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "A small set of timer related fixes: - Plug a race between rearm and process tick in the posix CPU timers code - Make the optimization to avoid recalculation of the next timer interrupt work correctly when there are no timers pending" * tag 'timers-urgent-2021-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timers: Fix get_next_timer_interrupt() with no timers pending posix-cpu-timers: Fix rearm racing against process tick
\| * \	Merge branch 'timers/urgent' of ↵	Thomas Gleixner	2021-07-20	2	-8/+10
\| \|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into timers/urgent Pull dyntick fixes from Frederic Weisbecker: - Fix a rearm race in the posix cpu timer code - Handle get_next_timer_interrupt() correctly when no timers are pending Link: https://lore.kernel.org/r/20210715104218.81276-1-frederic@kernel.org
\| \| * \|	timers: Fix get_next_timer_interrupt() with no timers pending	Nicolas Saenz Julienne	2021-07-15	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	31cd0e119d50 ("timers: Recalculate next timer interrupt only when necessary") subtly altered get_next_timer_interrupt()'s behaviour. The function no longer consistently returns KTIME_MAX with no timers pending. In order to decide if there are any timers pending we check whether the next expiry will happen NEXT_TIMER_MAX_DELTA jiffies from now. Unfortunately, the next expiry time and the timer base clock are no longer updated in unison. The former changes upon certain timer operations (enqueue, expire, detach), whereas the latter keeps track of jiffies as they move forward. Ultimately breaking the logic above. A simplified example: - Upon entering get_next_timer_interrupt() with: jiffies = 1 base->clk = 0; base->next_expiry = NEXT_TIMER_MAX_DELTA; 'base->next_expiry == base->clk + NEXT_TIMER_MAX_DELTA', the function returns KTIME_MAX. - 'base->clk' is updated to the jiffies value. - The next time we enter get_next_timer_interrupt(), taking into account no timer operations happened: base->clk = 1; base->next_expiry = NEXT_TIMER_MAX_DELTA; 'base->next_expiry != base->clk + NEXT_TIMER_MAX_DELTA', the function returns a valid expire time, which is incorrect. This ultimately might unnecessarily rearm sched's timer on nohz_full setups, and add latency to the system[1]. So, introduce 'base->timers_pending'[2], update it every time 'base->next_expiry' changes, and use it in get_next_timer_interrupt(). [1] See tick_nohz_stop_tick(). [2] A quick pahole check on x86_64 and arm64 shows it doesn't make 'struct timer_base' any bigger. Fixes: 31cd0e119d50 ("timers: Recalculate next timer interrupt only when necessary") Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
\| \| * \|	posix-cpu-timers: Fix rearm racing against process tick	Frederic Weisbecker	2021-07-15	1	-5/+5
\| \| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since the process wide cputime counter is started locklessly from posix_cpu_timer_rearm(), it can be concurrently stopped by operations on other timers from the same thread group, such as in the following unlucky scenario: CPU 0 CPU 1 ----- ----- timer_settime(TIMER B) posix_cpu_timer_rearm(TIMER A) cpu_clock_sample_group() (pct->timers_active already true) handle_posix_cpu_timers() check_process_timers() stop_process_timers() pct->timers_active = false arm_timer(TIMER A) tick -> run_posix_cpu_timers() // sees !pct->timers_active, ignore // our TIMER A Fix this with simply locking process wide cputime counting start and timer arm in the same block. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Fixes: 60f2ceaa8111 ("posix-cpu-timers: Remove unnecessary locking around cpu_clock_sample_group") Cc: stable@vger.kernel.org Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Eric W. Biederman <ebiederm@xmission.com>
* \| \|	Merge tag 'locking-urgent-2021-07-25' of ↵	Linus Torvalds	2021-07-25	1	-3/+4
\|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 jump label fix from Thomas Gleixner: "A single fix for jump labels to prevent the compiler from agressive un-inlining which results in a section mismatch" * tag 'locking-urgent-2021-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: jump_labels: Mark __jump_label_transform() as __always_inlined to work around aggressive compiler un-inlining
\| * \| \|	jump_labels: Mark __jump_label_transform() as __always_inlined to work ↵	Ingo Molnar	2021-07-13	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	around aggressive compiler un-inlining In randconfig testing, certain UBSAN and CC Kconfig combinations with GCC 10.3.0: CONFIG_X86_32=y CONFIG_CC_OPTIMIZE_FOR_SIZE=y CONFIG_UBSAN=y # CONFIG_UBSAN_TRAP is not set # CONFIG_UBSAN_BOUNDS is not set CONFIG_UBSAN_SHIFT=y # CONFIG_UBSAN_DIV_ZERO is not set CONFIG_UBSAN_UNREACHABLE=y CONFIG_UBSAN_BOOL=y # CONFIG_UBSAN_ENUM is not set # CONFIG_UBSAN_ALIGNMENT is not set # CONFIG_UBSAN_SANITIZE_ALL is not set ... produce this build warning (and build error if CONFIG_SECTION_MISMATCH_WARN_ONLY=y is set): WARNING: modpost: vmlinux.o(.text+0x4c1cc): Section mismatch in reference from the function __jump_label_transform() to the function .init.text:text_poke_early() The function __jump_label_transform() references the function __init text_poke_early(). This is often because __jump_label_transform lacks a __init annotation or the annotation of text_poke_early is wrong. ERROR: modpost: Section mismatches detected. The problem is that __jump_label_transform() gets uninlined by GCC, despite there being only a single local scope user of the 'static inline' function. Mark the function __always_inline instead, to work around this compiler bug/artifact. Signed-off-by: Ingo Molnar <mingo@kernel.org>
* \| \| \|	Merge tag 'efi-urgent-2021-07-25' of ↵	Linus Torvalds	2021-07-25	4	-7/+23
\|\ \ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI fixes from Thomas Gleixner: "A set of EFI fixes: - Prevent memblock and I/O reserved resources to get out of sync when EFI memreserve is in use. - Don't claim a non-existing table is invalid - Don't warn when firmware memory is already reserved correctly" * tag 'efi-urgent-2021-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi/mokvar: Reserve the table only if it is in boot services data efi/libstub: Fix the efi_load_initrd function description firmware/efi: Tell memblock about EFI iomem reservations efi/tpm: Differentiate missing and invalid final event log table.
\| * \ \ \	Merge tag 'efi-urgent-for-v5.14-rc2' of ↵	Ingo Molnar	2021-07-20	4	-7/+23
\| \|\ \ \ \ \| \| \|_\|/ / \| \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi into efi/urgent Pull EFI fixes for v5.14-rc2 from Ard Biesheuvel: " - Ensure that memblock reservations and IO reserved resources remain in sync when using the EFI memreserve feature. - Don't complain about invalid TPM final event log table if it is missing altogether. - Comment header fix for the stub. - Avoid a spurious warning when attempting to reserve firmware memory that is already reserved in the first place." Signed-off-by: Ingo Molnar <mingo@kernel.org>
\| \| * \| \|	efi/mokvar: Reserve the table only if it is in boot services data	Borislav Petkov	2021-07-20	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	One of the SUSE QA tests triggered: localhost kernel: efi: Failed to lookup EFI memory descriptor for 0x000000003dcf8000 which comes from x86's version of efi_arch_mem_reserve() trying to reserve a memory region. Usually, that function expects EFI_BOOT_SERVICES_DATA memory descriptors but the above case is for the MOKvar table which is allocated in the EFI shim as runtime services. That lead to a fix changing the allocation of that table to boot services. However, that fix broke booting SEV guests with that shim leading to this kernel fix 8d651ee9c71b ("x86/ioremap: Map EFI-reserved memory as encrypted for SEV") which extended the ioremap hint to map reserved EFI boot services as decrypted too. However, all that wasn't needed, IMO, because that error message in efi_arch_mem_reserve() was innocuous in this case - if the MOKvar table is not in boot services, then it doesn't need to be reserved in the first place because it is, well, in runtime services which should be reserved anyway. So do that reservation for the MOKvar table only if it is allocated in boot services data. I couldn't find any requirement about where that table should be allocated in, unlike the ESRT which allocation is mandated to be done in boot services data by the UEFI spec. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
\| \| * \| \|	efi/libstub: Fix the efi_load_initrd function description	Atish Patra	2021-07-16	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The soft_limit and hard_limit in the function efi_load_initrd describes the preferred and max address of initrd loading location respectively. However, the description wrongly describes it as the size of the allocated memory. Fix the function description. Signed-off-by: Atish Patra <atish.patra@wdc.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
\| \| * \| \|	firmware/efi: Tell memblock about EFI iomem reservations	Marc Zyngier	2021-07-16	1	-1/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	kexec_load_file() relies on the memblock infrastructure to avoid stamping over regions of memory that are essential to the survival of the system. However, nobody seems to agree how to flag these regions as reserved, and (for example) EFI only publishes its reservations in /proc/iomem for the benefit of the traditional, userspace based kexec tool. On arm64 platforms with GICv3, this can result in the payload being placed at the location of the LPI tables. Shock, horror! Let's augment the EFI reservation code with a memblock_reserve() call, protecting our dear tables from the secondary kernel invasion. Reported-by: Moritz Fischer <mdf@kernel.org> Tested-by: Moritz Fischer <mdf@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Cc: Ard Biesheuvel <ardb@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
\| \| * \| \|	efi/tpm: Differentiate missing and invalid final event log table.	Michal Suchanek	2021-07-16	1	-3/+5
\| \| \| \|/ \| \| \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Missing TPM final event log table is not a firmware bug. Clearly if providing event log in the old format makes the final event log invalid it should not be provided at least in that case. Fixes: b4f1874c6216 ("tpm: check event log version before reading final events") Signed-off-by: Michal Suchanek <msuchanek@suse.de> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
* \| \| \|	Merge tag 'core-urgent-2021-07-25' of ↵	Linus Torvalds	2021-07-25	1	-1/+1
\|\ \ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core fix from Thomas Gleixner: "A single update for the boot code to prevent aggressive un-inlining which causes a section mismatch" * tag 'core-urgent-2021-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: smpboot: Mark idle_init() as __always_inlined to work around aggressive compiler un-inlining
\| * \| \| \|	smpboot: Mark idle_init() as __always_inlined to work around aggressive ↵	Ingo Molnar	2021-07-13	1	-1/+1
\| \| \|_\|/ \| \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	compiler un-inlining While this function is a static inline, and is only used once in local scope, certain Kconfig variations may cause it to be compiled as a standalone function: 89231bf0 <idle_init>: 89231bf0: 83 05 60 d9 45 89 01 addl $0x1,0x8945d960 89231bf7: 55 push %ebp Resulting in this build failure: WARNING: modpost: vmlinux.o(.text.unlikely+0x7fd5): Section mismatch in reference from the function idle_init() to the function .init.text:fork_idle() The function idle_init() references the function __init fork_idle(). This is often because idle_init lacks a __init annotation or the annotation of fork_idle is wrong. ERROR: modpost: Section mismatches detected. Certain USBSAN options x86-32 builds with CONFIG_CC_OPTIMIZE_FOR_SIZE=y seem to be causing this. So mark idle_init() as __always_inline to work around this compiler bug/feature. Signed-off-by: Ingo Molnar <mingo@kernel.org>
* \| \| \|	Merge tag 'dma-mapping-5.14-1' of git://git.infradead.org/users/hch/dma-mapping	Linus Torvalds	2021-07-25	1	-2/+10
\|\ \ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pull dma-mapping fix from Christoph Hellwig: - handle vmalloc addresses in dma_common_{mmap,get_sgtable} (Roman Skakun) * tag 'dma-mapping-5.14-1' of git://git.infradead.org/users/hch/dma-mapping: dma-mapping: handle vmalloc addresses in dma_common_{mmap,get_sgtable}
\| * \| \| \|	dma-mapping: handle vmalloc addresses in dma_common_{mmap,get_sgtable}	Roman Skakun	2021-07-16	1	-2/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	xen-swiotlb can use vmalloc backed addresses for dma coherent allocations and uses the common helpers. Properly handle them to unbreak Xen on ARM platforms. Fixes: 1b65c4e5a9af ("swiotlb-xen: use xen_alloc/free_coherent_pages") Signed-off-by: Roman Skakun <roman_skakun@epam.com> Reviewed-by: Andrii Anisov <andrii_anisov@epam.com> [hch: split the patch, renamed the helpers] Signed-off-by: Christoph Hellwig <hch@lst.de>
* \| \| \| \|	Merge tag '5.14-rc2-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6	Linus Torvalds	2021-07-24	6	-55/+247
\|\ \ \ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pull cifs fixes from Steve French: "Five cifs/smb3 fixes, including a DFS failover fix, two fallocate fixes, and two trivial coverity cleanups" * tag '5.14-rc2-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: fix fallocate when trying to allocate a hole. CIFS: Clarify SMB1 code for POSIX delete file CIFS: Clarify SMB1 code for POSIX Create cifs: support share failover when remounting cifs: only write 64kb at a time when fallocating a small region of a file
\| * \| \| \| \|	cifs: fix fallocate when trying to allocate a hole.	Ronnie Sahlberg	2021-07-22	1	-5/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove the conditional checking for out_data_len and skipping the fallocate if it is 0. This is wrong will actually change any legitimate the fallocate where the entire region is unallocated into a no-op. Additionally, before allocating the range, if FALLOC_FL_KEEP_SIZE is set then we need to clamp the length of the fallocate region as to not extend the size of the file. Fixes: 966a3cb7c7db ("cifs: improve fallocate emulation") Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
\| * \| \| \| \|	CIFS: Clarify SMB1 code for POSIX delete file	Steve French	2021-07-22	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Coverity also complains about the way we calculate the offset (starting from the address of a 4 byte array within the header structure rather than from the beginning of the struct plus 4 bytes) for SMB1 CIFSPOSIXDelFile. This changeset doesn't change the address but makes it slightly clearer. Addresses-Coverity: 711519 ("Out of bounds write") Signed-off-by: Steve French <stfrench@microsoft.com>
\| * \| \| \| \|	CIFS: Clarify SMB1 code for POSIX Create	Steve French	2021-07-22	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Coverity also complains about the way we calculate the offset (starting from the address of a 4 byte array within the header structure rather than from the beginning of the struct plus 4 bytes) for SMB1 CIFSPOSIXCreate. This changeset doesn't change the address but makes it slightly clearer. Addresses-Coverity: 711518 ("Out of bounds write") Signed-off-by: Steve French <stfrench@microsoft.com>
\| * \| \| \| \|	cifs: support share failover when remounting	Paulo Alcantara	2021-07-22	4	-40/+203
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When remouting a DFS share, force a new DFS referral of the path and if the currently cached targets do not match any of the new targets or there was no cached targets, then mark it for reconnect. For example: $ mount //dom/dfs/link /mnt -o username=foo,password=bar $ ls /mnt oldfile.txt change target share of 'link' in server settings $ mount /mnt -o remount,username=foo,password=bar $ ls /mnt newfile.txt Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
\| * \| \| \| \|	cifs: only write 64kb at a time when fallocating a small region of a file	Ronnie Sahlberg	2021-07-22	1	-7/+19
\| \| \|_\|/ / \| \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We only allow sending single credit writes through the SMB2_write() synchronous api so split this into smaller chunks. Fixes: 966a3cb7c7db ("cifs: improve fallocate emulation") Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Reported-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Steve French <stfrench@microsoft.com>
* \| \| \| \|	Merge tag 'riscv-for-linus-5.14-rc3' of ↵	Linus Torvalds	2021-07-24	4	-21/+48
\|\ \ \ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - properly set the memory size, which fixes 32-bit systems - allow initrd to load anywhere in memory, rather that restricting it to the first 256MiB - fix the 'mem=' parameter on 64-bit systems to properly account for the maximum supported memory now that the kernel is outside the linear map - avoid installing mappings into the last 4KiB of memory, which conflicts with error values - avoid the stack from being freed while it is being walked - a handful of fixes to the new copy to/from user routines * tag 'riscv-for-linus-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: __asm_copy_to-from_user: Fix: Typos in comments riscv: __asm_copy_to-from_user: Remove unnecessary size check riscv: __asm_copy_to-from_user: Fix: fail on RV32 riscv: __asm_copy_to-from_user: Fix: overrun copy riscv: stacktrace: pin the task's stack in get_wchan riscv: Make sure the kernel mapping does not overlap with IS_ERR_VALUE riscv: Make sure the linear mapping does not use the kernel mapping riscv: Fix memory_limit for 64-bit kernel RISC-V: load initrd wherever it fits into memory riscv: Fix 32-bit RISC-V boot failure
\| * \| \| \| \|	riscv: __asm_copy_to-from_user: Fix: Typos in comments	Akira Tsukamoto	2021-07-23	1	-9/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixing typos and grammar mistakes and using more intuitive label name. Signed-off-by: Akira Tsukamoto <akira.tsukamoto@gmail.com> Fixes: ca6eaaa210de ("riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall") Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
\| * \| \| \| \|	riscv: __asm_copy_to-from_user: Remove unnecessary size check	Akira Tsukamoto	2021-07-23	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Clean up: The size of 0 will be evaluated in the next step. Not required here. Signed-off-by: Akira Tsukamoto <akira.tsukamoto@gmail.com> Fixes: ca6eaaa210de ("riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall") Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
\| * \| \| \| \|	riscv: __asm_copy_to-from_user: Fix: fail on RV32	Akira Tsukamoto	2021-07-23	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Had a bug when converting bytes to bits when the cpu was rv32. The a3 contains the number of bytes and multiple of 8 would be the bits. The LGREG is holding 2 for RV32 and 3 for RV32, so to achieve multiple of 8 it must always be constant 3. The 2 was mistakenly used for rv32. Signed-off-by: Akira Tsukamoto <akira.tsukamoto@gmail.com> Fixes: ca6eaaa210de ("riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall") Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
\| * \| \| \| \|	riscv: __asm_copy_to-from_user: Fix: overrun copy	Akira Tsukamoto	2021-07-23	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There were two causes for the overrun memory access. The threshold size was too small. The aligning dst require one SZREG and unrolling word copy requires 8SZREG, total have to be at least 9SZREG. Inside the unrolling copy, the subtracting -(8SZREG-1) would make iteration happening one extra loop. Proper value is -(8SZREG). Signed-off-by: Akira Tsukamoto <akira.tsukamoto@gmail.com> Fixes: ca6eaaa210de ("riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall") Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
\| * \| \| \| \|	riscv: stacktrace: pin the task's stack in get_wchan	Jisheng Zhang	2021-07-23	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pin the task's stack before calling walk_stackframe() in get_wchan(). This can fix the panic as reported by Andreas when CONFIG_VMAP_STACK=y: [ 65.609696] Unable to handle kernel paging request at virtual address ffffffd0003bbde8 [ 65.610460] Oops [#1] [ 65.610626] Modules linked in: virtio_blk virtio_mmio rtc_goldfish btrfs blake2b_generic libcrc32c xor raid6_pq sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua efivarfs [ 65.611670] CPU: 2 PID: 1 Comm: systemd Not tainted 5.14.0-rc1-1.g34fe32a-default #1 openSUSE Tumbleweed (unreleased) c62f7109153e5a0897ee58ba52393ad99b070fd2 [ 65.612334] Hardware name: riscv-virtio,qemu (DT) [ 65.613008] epc : get_wchan+0x5c/0x88 [ 65.613334] ra : get_wchan+0x42/0x88 [ 65.613625] epc : ffffffff800048a4 ra : ffffffff8000488a sp : ffffffd00021bb90 [ 65.614008] gp : ffffffff817709f8 tp : ffffffe07fe91b80 t0 : 00000000000001f8 [ 65.614411] t1 : 0000000000020000 t2 : 0000000000000000 s0 : ffffffd00021bbd0 [ 65.614818] s1 : ffffffd0003bbdf0 a0 : 0000000000000001 a1 : 0000000000000002 [ 65.615237] a2 : ffffffff81618008 a3 : 0000000000000000 a4 : 0000000000000000 [ 65.615637] a5 : ffffffd0003bc000 a6 : 0000000000000002 a7 : ffffffe27d370000 [ 65.616022] s2 : ffffffd0003bbd90 s3 : ffffffff8071a81e s4 : 0000000000003fff [ 65.616407] s5 : ffffffffffffc000 s6 : 0000000000000000 s7 : ffffffff81618008 [ 65.616845] s8 : 0000000000000001 s9 : 0000000180000040 s10: 0000000000000000 [ 65.617248] s11: 000000000000016b t3 : 000000ff00000000 t4 : 0c6aec92de5e3fd7 [ 65.617672] t5 : fff78f60608fcfff t6 : 0000000000000078 [ 65.618088] status: 0000000000000120 badaddr: ffffffd0003bbde8 cause: 000000000000000d [ 65.618621] [<ffffffff800048a4>] get_wchan+0x5c/0x88 [ 65.619008] [<ffffffff8022da88>] do_task_stat+0x7a2/0xa46 [ 65.619325] [<ffffffff8022e87e>] proc_tgid_stat+0xe/0x16 [ 65.619637] [<ffffffff80227dd6>] proc_single_show+0x46/0x96 [ 65.619979] [<ffffffff801ccb1e>] seq_read_iter+0x190/0x31e [ 65.620341] [<ffffffff801ccd70>] seq_read+0xc4/0x104 [ 65.620633] [<ffffffff801a6bfe>] vfs_read+0x6a/0x112 [ 65.620922] [<ffffffff801a701c>] ksys_read+0x54/0xbe [ 65.621206] [<ffffffff801a7094>] sys_read+0xe/0x16 [ 65.621474] [<ffffffff8000303e>] ret_from_syscall+0x0/0x2 [ 65.622169] ---[ end trace f24856ed2b8789c5 ]--- [ 65.622832] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
\| * \| \| \| \|	riscv: Make sure the kernel mapping does not overlap with IS_ERR_VALUE	Alexandre Ghiti	2021-07-22	1	-2/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The check that is done in setup_bootmem currently only works for 32-bit kernel since the kernel mapping has been moved outside of the linear mapping for 64-bit kernel. So make sure that for 64-bit kernel, the kernel mapping does not overlap with the last 4K of the addressable memory. Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Fixes: 2bfc6cd81bd1 ("riscv: Move kernel mapping outside of linear mapping") Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
\| * \| \| \| \|	riscv: Make sure the linear mapping does not use the kernel mapping	Alexandre Ghiti	2021-07-22	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For 64-bit kernel, the end of the address space is occupied by the kernel mapping and currently, the functions to populate the kernel page tables (i.e. create_p*d_mapping) do not override existing mapping so we must make sure the linear mapping does not map memory in the kernel mapping by clipping the memory above the memory limit. Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Fixes: c9811e379b21 ("riscv: Add mem kernel parameter support") Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
\| * \| \| \| \|	riscv: Fix memory_limit for 64-bit kernel	Alexandre Ghiti	2021-07-22	1	-2/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As described in Documentation/riscv/vm-layout.rst, the end of the virtual address space for 64-bit kernel is occupied by the modules/BPF/ kernel mappings so this actually reduces the amount of memory we are able to map and then use in the linear mapping. So make sure this limit is correctly set. Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Fixes: 2bfc6cd81bd1 ("riscv: Move kernel mapping outside of linear mapping") Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
\| * \| \| \| \|	RISC-V: load initrd wherever it fits into memory	Heinrich Schuchardt	2021-07-21	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Requiring that initrd is loaded below RAM start + 256 MiB led to failure to boot SUSE Linux with GRUB on QEMU, cf. https://lists.gnu.org/archive/html/grub-devel/2021-06/msg00037.html Remove the constraint. Reported-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Reviewed-by: Atish Patra <atish.patra@wdc.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Fixes: d7071743db31 ("RISC-V: Add EFI stub support.") Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
\| * \| \| \| \|	Merge remote-tracking branch 'riscv/riscv-fix-32bit' into fixes	Palmer Dabbelt	2021-07-21	1	-0/+1
\| \|\ \ \ \ \ \| \| \|_\|_\|_\|/ \| \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This contains a single fix for 32-bit boot. It happens this was already fixed by c9811e379b21 ("riscv: Add mem kernel parameter support"), but the bug existed before that feature addition so I've applied the patch earlier and then merged it in (which results in a conflict, which is fixed via not changing the resulting tree). * riscv/riscv-fix-32bit: riscv: Fix 32-bit RISC-V boot failure
\| \| * \| \| \|	riscv: Fix 32-bit RISC-V boot failure	Bin Meng	2021-07-21	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit dd2d082b5760 ("riscv: Cleanup setup_bootmem()") adjusted the calling sequence in setup_bootmem(), which invalidates the fix commit de043da0b9e7 ("RISC-V: Fix usage of memblock_enforce_memory_limit") did for 32-bit RISC-V unfortunately. So now 32-bit RISC-V does not boot again when testing booting kernel on QEMU 'virt' with '-m 2G', which was exactly what the original commit de043da0b9e7 ("RISC-V: Fix usage of memblock_enforce_memory_limit") tried to fix. Fixes: dd2d082b5760 ("riscv: Cleanup setup_bootmem()") Signed-off-by: Bin Meng <bmeng.cn@gmail.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
* \| \| \| \| \|	ACPI: fix NULL pointer dereference	Linus Torvalds	2021-07-24	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 71f642833284 ("ACPI: utils: Fix reference counting in for_each_acpi_dev_match()") started doing "acpi_dev_put()" on a pointer that was possibly NULL. That fails miserably, because that helper inline function is not set up to handle that case. Just make acpi_dev_put() silently accept a NULL pointer, rather than calling down to put_device() with an invalid offset off that NULL pointer. Link: https://lore.kernel.org/lkml/a607c149-6bf6-0fd0-0e31-100378504da2@kernel.dk/ Reported-and-tested-by: Jens Axboe <axboe@kernel.dk> Tested-by: Daniel Scally <djrscally@gmail.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \| \| \| \| \|	Merge tag 'scsi-fixes' of ↵	Linus Torvalds	2021-07-24	6	-92/+78
\|\ \ \ \ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Four fixes, all in drivers, all of which can lead to user visible problems in certain situations" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: target: Fix NULL dereference on XCOPY completion scsi: mpt3sas: Transition IOC to Ready state during shutdown scsi: target: Fix protect handling in WRITE SAME(32) scsi: iscsi: Fix iface sysfs attr detection