summaryrefslogtreecommitdiffstats
path: root/kernel
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'core-urgent-2021-01-31' of ↵Linus Torvalds2021-01-311-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull single stepping fix from Thomas Gleixner: "A single fix for the single step reporting regression caused by getting the condition wrong when moving SYSCALL_EMU away from TIF flags" [ There's apparently another problem too, fix pending ] * tag 'core-urgent-2021-01-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: entry: Unbreak single step reporting behaviour
| * entry: Unbreak single step reporting behaviourYuxuan Shui2021-01-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The move of TIF_SYSCALL_EMU to SYSCALL_WORK_SYSCALL_EMU broke single step reporting. The original code reported the single step when TIF_SINGLESTEP was set and TIF_SYSCALL_EMU was not set. The SYSCALL_WORK conversion got the logic wrong and now the reporting only happens when both bits are set. Restore the original behaviour. [ tglx: Massaged changelog and dropped the pointless double negation ] Fixes: 64eb35f701f0 ("ptrace: Migrate TIF_SYSCALL_EMU to use SYSCALL_WORK flag") Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Gabriel Krisman Bertazi <krisman@collabora.com> Link: https://lore.kernel.org/r/877do3gaq9.fsf@m5Zedd9JOGzJrf0
* | Merge tag 'pm-5.11-rc6' of ↵Linus Torvalds2021-01-292-3/+1
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix a deadlock in the 'kexec jump' code and address a possible hibernation image creation issue. Specifics: - Fix a deadlock caused by attempting to acquire the same mutex twice in a row in the "kexec jump" code (Baoquan He) - Modify the hibernation image saving code to flush the unwritten data to the swap storage later so as to avoid failing to write the image signature which is possible in some cases (Laurent Badel)" * tag 'pm-5.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: hibernate: flush swap writer after marking kernel: kexec: remove the lock operation of system_transition_mutex
| * | PM: hibernate: flush swap writer after markingLaurent Badel2021-01-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Flush the swap writer after, not before, marking the files, to ensure the signature is properly written. Fixes: 6f612af57821 ("PM / Hibernate: Group swap ops") Signed-off-by: Laurent Badel <laurentbadel@eaton.com> Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
| * | kernel: kexec: remove the lock operation of system_transition_mutexBaoquan He2021-01-251-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Function kernel_kexec() is called with lock system_transition_mutex held in reboot system call. While inside kernel_kexec(), it will acquire system_transition_mutex agin. This will lead to dead lock. The dead lock should be easily triggered, it hasn't caused any failure report just because the feature 'kexec jump' is almost not used by anyone as far as I know. An inquiry can be made about who is using 'kexec jump' and where it's used. Before that, let's simply remove the lock operation inside CONFIG_KEXEC_JUMP ifdeffery scope. Fixes: 55f2503c3b69 ("PM / reboot: Eliminate race between reboot and suspend") Signed-off-by: Baoquan He <bhe@redhat.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Pingfan Liu <kernelfans@gmail.com> Cc: 4.19+ <stable@vger.kernel.org> # 4.19+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* | | futex: Handle faults correctly for PI futexesThomas Gleixner2021-01-261-37/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fixup_pi_state_owner() tries to ensure that the state of the rtmutex, pi_state and the user space value related to the PI futex are consistent before returning to user space. In case that the user space value update faults and the fault cannot be resolved by faulting the page in via fault_in_user_writeable() the function returns with -EFAULT and leaves the rtmutex and pi_state owner state inconsistent. A subsequent futex_unlock_pi() operates on the inconsistent pi_state and releases the rtmutex despite not owning it which can corrupt the RB tree of the rtmutex and cause a subsequent kernel stack use after free. It was suggested to loop forever in fixup_pi_state_owner() if the fault cannot be resolved, but that results in runaway tasks which is especially undesired when the problem happens due to a programming error and not due to malice. As the user space value cannot be fixed up, the proper solution is to make the rtmutex and the pi_state consistent so both have the same owner. This leaves the user space value out of sync. Any subsequent operation on the futex will fail because the 10th rule of PI futexes (pi_state owner and user space value are consistent) has been violated. As a consequence this removes the inept attempts of 'fixing' the situation in case that the current task owns the rtmutex when returning with an unresolvable fault by unlocking the rtmutex which left pi_state::owner and rtmutex::owner out of sync in a different and only slightly less dangerous way. Fixes: 1b7558e457ed ("futexes: fix fault handling in futex_lock_pi") Reported-by: gzobqq@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org
* | | futex: Simplify fixup_pi_state_owner()Thomas Gleixner2021-01-261-27/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | Too many gotos already and an upcoming fix would make it even more unreadable. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org
* | | futex: Use pi_state_update_owner() in put_pi_state()Thomas Gleixner2021-01-261-7/+1
| | | | | | | | | | | | | | | | | | | | | | | | No point in open coding it. This way it gains the extra sanity checks. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org
* | | rtmutex: Remove unused argument from rt_mutex_proxy_unlock()Thomas Gleixner2021-01-263-5/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | Nothing uses the argument. Remove it as preparation to use pi_state_update_owner(). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org
* | | futex: Provide and use pi_state_update_owner()Thomas Gleixner2021-01-261-33/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Updating pi_state::owner is done at several places with the same code. Provide a function for it and use that at the obvious places. This is also a preparation for a bug fix to avoid yet another copy of the same code or alternatively introducing a completely unpenetratable mess of gotos. Originally-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org
* | | futex: Replace pointless printk in fixup_owner()Thomas Gleixner2021-01-261-7/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If that unexpected case of inconsistent arguments ever happens then the futex state is left completely inconsistent and the printk is not really helpful. Replace it with a warning and make the state consistent. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org
* | | futex: Ensure the correct return value from futex_lock_pi()Thomas Gleixner2021-01-261-15/+16
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In case that futex_lock_pi() was aborted by a signal or a timeout and the task returned without acquiring the rtmutex, but is the designated owner of the futex due to a concurrent futex_unlock_pi() fixup_owner() is invoked to establish consistent state. In that case it invokes fixup_pi_state_owner() which in turn tries to acquire the rtmutex again. If that succeeds then it does not propagate this success to fixup_owner() and futex_lock_pi() returns -EINTR or -ETIMEOUT despite having the futex locked. Return success from fixup_pi_state_owner() in all cases where the current task owns the rtmutex and therefore the futex and propagate it correctly through fixup_owner(). Fixup the other callsite which does not expect a positive return value. Fixes: c1e2f0eaf015 ("futex: Avoid violating the 10th rule of futex") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org
* | Merge tag 'printk-for-5.11-urgent-fixup' of ↵Linus Torvalds2021-01-251-1/+1
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull printk fix from Petr Mladek: "The fix of a potential buffer overflow in 5.11-rc5 introduced another one. The trailing '\0' might be written up to the message "len" past the buffer. Fortunately, it is not that easy to hit. Most readers use 1kB buffers for a single message. Typical messages fit into the temporary buffer with enough reserve. Also readers do not rely on the '\0'. It is related to the previous fix. Some readers required the space for the trailing '\0'. We decided to write it there to avoid such regressions in the future. The most realistic victims are dumpers using kmsg_dump_get_buffer(). They are filling the entire buffer with as many messages as possible. They are typically used when handling panic()" * tag 'printk-for-5.11-urgent-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: printk: fix string termination for record_print_text()
| * Merge branch 'printk-rework' into for-linusPetr Mladek2021-01-251-1/+1
| |\
| | * printk: fix string termination for record_print_text()John Ogness2021-01-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit f0e386ee0c0b ("printk: fix buffer overflow potential for print_text()") added string termination in record_print_text(). However it used the wrong base pointer for adding the terminator. This led to a 0-byte being written somewhere beyond the buffer. Use the correct base pointer when adding the terminator. Fixes: f0e386ee0c0b ("printk: fix buffer overflow potential for print_text()") Reported-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20210124202728.4718-1-john.ogness@linutronix.de
* | | Merge tag 'irq_urgent_for_v5.11_rc5' of ↵Linus Torvalds2021-01-242-1/+2
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Borislav Petkov: - Fix a kernel panic in mips-cpu due to invalid irq domain hierarchy. - Fix to not lose IPIs on bcm2836. - Fix for a bogus marking of ITS devices as shared due to unitialized stack variable. - Clear a phantom interrupt on qcom-pdc to unblock suspend. - Small cleanups, warning and build fixes. * tag 'irq_urgent_for_v5.11_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq: Export irq_check_status_bit() irqchip/mips-cpu: Set IPI domain parent chip irqchip/pruss: Simplify the TI_PRUSS_INTC Kconfig irqchip/loongson-liointc: Fix build warnings driver core: platform: Add extra error check in devm_platform_get_irqs_affinity() irqchip/bcm2836: Fix IPI acknowledgement after conversion to handle_percpu_devid_irq irqchip/irq-sl28cpld: Convert comma to semicolon genirq/msi: Initialize msi_alloc_info before calling msi_domain_prepare_irqs()
| * | | genirq: Export irq_check_status_bit()Thomas Gleixner2021-01-131-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | One of the users can be built modular: ERROR: modpost: "irq_check_status_bit" [drivers/perf/arm_spe_pmu.ko] undefined! Fixes: fdd029630434 ("genirq: Move status flag checks to core") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201227192049.GA195845@roeck-us.net
| * | | Merge tag 'irqchip-fixes-5.11-1' of ↵Thomas Gleixner2021-01-121-1/+1
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent Pull irqchip fixes from Marc Zyngier: - Fix the MIPS CPU interrupt controller hierarchy - Simplify the PRUSS Kconfig entry - Eliminate trivial build warnings on the MIPS Loongson liointc - Fix error path in devm_platform_get_irqs_affinity() - Turn the BCM2836 IPI irq_eoi callback into irq_ack - Fix initialisation of on-stack msi_alloc_info - Cleanup spurious comma in irq-sl28cpld Link: https://lore.kernel.org/r/20210110110001.2328708-1-maz@kernel.org
| | * | | genirq/msi: Initialize msi_alloc_info before calling msi_domain_prepare_irqs()Zenghui Yu2020-12-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit 5fe71d271df8 ("irqchip/gic-v3-its: Tag ITS device as shared if allocating for a proxy device"), some of the devices are wrongly marked as "shared" by the ITS driver on systems equipped with the ITS(es). The problem is that the @info->flags may not be initialized anywhere and we end up looking at random bits on the stack. That's obviously not good. We can perform the initialization in the IRQ core layer before calling msi_domain_prepare_irqs(), which is neat enough. Fixes: 5fe71d271df8 ("irqchip/gic-v3-its: Tag ITS device as shared if allocating for a proxy device") Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201218060039.1770-1-yuzenghui@huawei.com
* | | | | Merge tag 'sched_urgent_for_v5.11_rc5' of ↵Linus Torvalds2021-01-245-33/+129
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Borislav Petkov: - Correct the marking of kthreads which are supposed to run on a specific, single CPU vs such which are affine to only one CPU, mark per-cpu workqueue threads as such and make sure that marking "survives" CPU hotplug. Fix CPU hotplug issues with such kthreads. - A fix to not push away tasks on CPUs coming online. - Have workqueue CPU hotplug code use cpu_possible_mask when breaking affinity on CPU offlining so that pending workers can finish on newly arrived onlined CPUs too. - Dump tasks which haven't vacated a CPU which is currently being unplugged. - Register a special scale invariance callback which gets called on resume from RAM to read out APERF/MPERF after resume and thus make the schedutil scaling governor more precise. * tag 'sched_urgent_for_v5.11_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Relax the set_cpus_allowed_ptr() semantics sched: Fix CPU hotplug / tighten is_per_cpu_kthread() sched: Prepare to use balance_push in ttwu() workqueue: Restrict affinity change to rescuer workqueue: Tag bound workers with KTHREAD_IS_PER_CPU kthread: Extract KTHREAD_IS_PER_CPU sched: Don't run cpu-online with balance_push() enabled workqueue: Use cpu_possible_mask instead of cpu_active_mask to break affinity sched/core: Print out straggler tasks in sched_cpu_dying() x86: PM: Register syscore_ops for scale invariance
| * | | | | sched: Relax the set_cpus_allowed_ptr() semanticsPeter Zijlstra2021-01-221-11/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that we have KTHREAD_IS_PER_CPU to denote the critical per-cpu tasks to retain during CPU offline, we can relax the warning in set_cpus_allowed_ptr(). Any spurious kthread that wants to get on at the last minute will get pushed off before it can run. While during CPU online there is no harm, and actual benefit, to allowing kthreads back on early, it simplifies hotplug code and fixes a number of outstanding races. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Lai jiangshan <jiangshanlai@gmail.com> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Tested-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/20210121103507.240724591@infradead.org
| * | | | | sched: Fix CPU hotplug / tighten is_per_cpu_kthread()Peter Zijlstra2021-01-221-4/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prior to commit 1cf12e08bc4d ("sched/hotplug: Consolidate task migration on CPU unplug") we'd leave any task on the dying CPU and break affinity and force them off at the very end. This scheme had to change in order to enable migrate_disable(). One cannot wait for migrate_disable() to complete while stuck in stop_machine(). Furthermore, since we need at the very least: idle, hotplug and stop threads at any point before stop_machine, we can't break affinity and/or push those away. Under the assumption that all per-cpu kthreads are sanely handled by CPU hotplug, the new code no long breaks affinity or migrates any of them (which then includes the critical ones above). However, there's an important difference between per-cpu kthreads and kthreads that happen to have a single CPU affinity which is lost. The latter class very much relies on the forced affinity breaking and migration semantics previously provided. Use the new kthread_is_per_cpu() infrastructure to tighten is_per_cpu_kthread() and fix the hot-unplug problems stemming from the change. Fixes: 1cf12e08bc4d ("sched/hotplug: Consolidate task migration on CPU unplug") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Tested-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/20210121103507.102416009@infradead.org
| * | | | | sched: Prepare to use balance_push in ttwu()Peter Zijlstra2021-01-222-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation of using the balance_push state in ttwu() we need it to provide a reliable and consistent state. The immediate problem is that rq->balance_callback gets cleared every schedule() and then re-set in the balance_push_callback() itself. This is not a reliable signal, so add a variable that stays set during the entire time. Also move setting it before the synchronize_rcu() in sched_cpu_deactivate(), such that we get guaranteed visibility to ttwu(), which is a preempt-disable region. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Tested-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/20210121103506.966069627@infradead.org
| * | | | | workqueue: Restrict affinity change to rescuerPeter Zijlstra2021-01-221-6/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | create_worker() will already set the right affinity using kthread_bind_mask(), this means only the rescuer will need to change it's affinity. Howveer, while in cpu-hot-unplug a regular task is not allowed to run on online&&!active as it would be pushed away quite agressively. We need KTHREAD_IS_PER_CPU to survive in that environment. Therefore set the affinity after getting that magic flag. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Tested-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/20210121103506.826629830@infradead.org
| * | | | | workqueue: Tag bound workers with KTHREAD_IS_PER_CPUPeter Zijlstra2021-01-221-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mark the per-cpu workqueue workers as KTHREAD_IS_PER_CPU. Workqueues have unfortunate semantics in that per-cpu workers are not default flushed and parked during hotplug, however a subset does manual flush on hotplug and hard relies on them for correctness. Therefore play silly games.. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Tested-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/20210121103506.693465814@infradead.org
| * | | | | kthread: Extract KTHREAD_IS_PER_CPUPeter Zijlstra2021-01-222-1/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a need to distinguish geniune per-cpu kthreads from kthreads that happen to have a single CPU affinity. Geniune per-cpu kthreads are kthreads that are CPU affine for correctness, these will obviously have PF_KTHREAD set, but must also have PF_NO_SETAFFINITY set, lest userspace modify their affinity and ruins things. However, these two things are not sufficient, PF_NO_SETAFFINITY is also set on other tasks that have their affinities controlled through other means, like for instance workqueues. Therefore another bit is needed; it turns out kthread_create_per_cpu() already has such a bit: KTHREAD_IS_PER_CPU, which is used to make kthread_park()/kthread_unpark() work correctly. Expose this flag and remove the implicit setting of it from kthread_create_on_cpu(); the io_uring usage of it seems dubious at best. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Tested-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/20210121103506.557620262@infradead.org
| * | | | | sched: Don't run cpu-online with balance_push() enabledPeter Zijlstra2021-01-221-2/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We don't need to push away tasks when we come online, mark the push complete right before the CPU dies. XXX hotplug state machine has trouble with rollback here. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Tested-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/20210121103506.415606087@infradead.org
| * | | | | workqueue: Use cpu_possible_mask instead of cpu_active_mask to break affinityLai Jiangshan2021-01-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The scheduler won't break affinity for us any more, and we should "emulate" the same behavior when the scheduler breaks affinity for us. The behavior is "changing the cpumask to cpu_possible_mask". And there might be some other CPUs online later while the worker is still running with the pending work items. The worker should be allowed to use the later online CPUs as before and process the work items ASAP. If we use cpu_active_mask here, we can't achieve this goal but using cpu_possible_mask can. Fixes: 06249738a41a ("workqueue: Manually break affinity on hotplug") Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Acked-by: Tejun Heo <tj@kernel.org> Tested-by: Paul E. McKenney <paulmck@kernel.org> Tested-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/20210111152638.2417-4-jiangshanlai@gmail.com
| * | | | | sched/core: Print out straggler tasks in sched_cpu_dying()Valentin Schneider2021-01-221-1/+23
| |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit 1cf12e08bc4d ("sched/hotplug: Consolidate task migration on CPU unplug") tasks are expected to move themselves out of a out-going CPU. For most tasks this will be done automagically via BALANCE_PUSH, but percpu kthreads will have to cooperate and move themselves away one way or another. Currently, some percpu kthreads (workqueues being a notable exemple) do not cooperate nicely and can end up on an out-going CPU at the time sched_cpu_dying() is invoked. Print the dying rq's tasks to shed some light on the stragglers. Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Tested-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/20210113183141.11974-1-valentin.schneider@arm.com
* | | | | Merge tag 'timers_urgent_for_v5.11_rc5' of ↵Linus Torvalds2021-01-242-4/+3
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Borislav Petkov: - Fix an integer overflow in the NTP RTC synchronization which led to the latter happening every 2 seconds instead of the intended every 11 minutes. - Get rid of now unused get_seconds(). * tag 'timers_urgent_for_v5.11_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: ntp: Fix RTC synchronization on 32-bit platforms timekeeping: Remove unused get_seconds()
| * | | | | ntp: Fix RTC synchronization on 32-bit platformsGeert Uytterhoeven2021-01-121-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Due to an integer overflow, RTC synchronization now happens every 2s instead of the intended 11 minutes. Fix this by forcing 64-bit arithmetic for the sync period calculation. Annotate the other place which multiplies seconds for consistency as well. Fixes: c9e6189fb03123a7 ("ntp: Make the RTC synchronization more reliable") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210111103956.290378-1-geert+renesas@glider.be
| * | | | | timekeeping: Remove unused get_seconds()Chunguang Xu2021-01-121-2/+1
| |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The get_seconds() cleanup seems to have been completed, now it is time to delete the legacy interface to avoid misuse later. Signed-off-by: Chunguang Xu <brookxu@tencent.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/1606816351-26900-1-git-send-email-brookxu@tencent.com
* | | | | Merge tag 'x86_urgent_for_v5.11_rc5' of ↵Linus Torvalds2021-01-241-2/+7
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Add a new Intel model number for Alder Lake - Differentiate which aspects of the FPU state get saved/restored when the FPU is used in-kernel and fix a boot crash on K7 due to early MXCSR access before CR4.OSFXSR is even set. - A couple of noinstr annotation fixes - Correct die ID setting on AMD for users of topology information which need the correct die ID - A SEV-ES fix to handle string port IO to/from kernel memory properly * tag 'x86_urgent_for_v5.11_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu: Add another Alder Lake CPU to the Intel family x86/mmx: Use KFPU_387 for MMX string operations x86/fpu: Add kernel_fpu_begin_mask() to selectively initialize state x86/topology: Make __max_die_per_package available unconditionally x86: __always_inline __{rd,wr}msr() x86/mce: Remove explicit/superfluous tracing locking/lockdep: Avoid noinstr warning for DEBUG_LOCKDEP locking/lockdep: Cure noinstr fail x86/sev: Fix nonistr violation x86/entry: Fix noinstr fail x86/cpu/amd: Set __max_die_per_package on AMD x86/sev-es: Handle string port IO to kernel memory properly
| * | | | | locking/lockdep: Avoid noinstr warning for DEBUG_LOCKDEPPeter Zijlstra2021-01-121-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vmlinux.o: warning: objtool: lock_is_held_type()+0x60: call to check_flags.part.0() leaves .noinstr.text section Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210106144017.652218215@infradead.org
| * | | | | locking/lockdep: Cure noinstr failPeter Zijlstra2021-01-121-1/+1
| |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the compiler doesn't feel like inlining, it causes a noinstr fail: vmlinux.o: warning: objtool: lock_is_held_type()+0xb: call to lockdep_enabled() leaves .noinstr.text section Fixes: 4d004099a668 ("lockdep: Fix lockdep recursion") Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210106144017.592595176@infradead.org
* | | | | Merge tag 'for-linus-2021-01-24' of ↵Linus Torvalds2021-01-243-6/+5
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull misc fixes from Christian Brauner: - Jann reported sparse complaints because of a missing __user annotation in a helper we added way back when we added pidfd_send_signal() to avoid compat syscall handling. Fix it. - Yanfei replaces a reference in a comment to the _do_fork() helper I removed a while ago with a reference to the new kernel_clone() replacement - Alexander Guril added a simple coding style fix * tag 'for-linus-2021-01-24' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: kthread: remove comments about old _do_fork() helper Kernel: fork.c: Fix coding style: Do not use {} around single-line statements signal: Add missing __user annotation to copy_siginfo_from_user_any
| * | | | | kthread: remove comments about old _do_fork() helperYanfei Xu2021-01-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The old _do_fork() helper has been removed in favor of kernel_clone(). Here correct some comments which still contain _do_fork() Link: https://lore.kernel.org/r/20210111104807.18022-1-yanfei.xu@windriver.com Cc: christian@brauner.io Cc: linux-kernel@vger.kernel.org Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Yanfei Xu <yanfei.xu@windriver.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
| * | | | | Kernel: fork.c: Fix coding style: Do not use {} around single-line statementsAlexander Guril2021-01-111-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixed two coding style issues in kernel/fork.c Do not use {} around single-line statements. Cc: linux-kernel@vger.kernel.org Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Alexander Guril <alexander.guril02@gmail.com> Link: https://lore.kernel.org/r/20201226114021.2589-1-alexander.guril02@gmail.com Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
| * | | | | signal: Add missing __user annotation to copy_siginfo_from_user_anyJann Horn2021-01-111-1/+2
| |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | copy_siginfo_from_user_any() takes a userspace pointer as second argument; annotate the parameter type accordingly. Signed-off-by: Jann Horn <jannh@google.com> Link: https://lore.kernel.org/r/20201207000252.138564-1-jannh@google.com Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
* | | | | Merge tag 'printk-for-5.11-printk-rework-fixup' of ↵Linus Torvalds2021-01-212-12/+30
|\ \ \ \ \ | | |_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull printk fixes from Petr Mladek: - Fix line counting and buffer size calculation. Both regressions caused that a reader buffer might not get filled as much as possible. - Restore non-documented behavior of printk() reader API and make it official. It did not fill the last byte of the provided buffer before 5.10. Two architectures, powerpc and um, used it to add the trailing '\0'. There might theoretically be more callers depending on this behavior in userspace. * tag 'printk-for-5.11-printk-rework-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: printk: fix buffer overflow potential for print_text() printk: fix kmsg_dump_get_buffer length calulations printk: ringbuffer: fix line counting
| * | | | Merge branch 'printk-rework' into for-linusPetr Mladek2021-01-212-12/+30
| |\ \ \ \ | | | |_|/ | | |/| |
| | * | | printk: fix buffer overflow potential for print_text()John Ogness2021-01-191-9/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before the commit 896fbe20b4e2333fb55 ("printk: use the lockless ringbuffer"), msg_print_text() would only write up to size-1 bytes into the provided buffer. Some callers expect this behavior and append a terminator to returned string. In particular: arch/powerpc/xmon/xmon.c:dump_log_buf() arch/um/kernel/kmsg_dump.c:kmsg_dumper_stdout() msg_print_text() has been replaced by record_print_text(), which currently fills the full size of the buffer. This causes a buffer overflow for the above callers. Change record_print_text() so that it will only use size-1 bytes for text data. Also, for paranoia sakes, add a terminator after the text data. And finally, document this behavior so that it is clear that only size-1 bytes are used and a terminator is added. Fixes: 896fbe20b4e2333fb55 ("printk: use the lockless ringbuffer") Cc: stable@vger.kernel.org # 5.10+ Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20210114170412.4819-1-john.ogness@linutronix.de
| | * | | printk: fix kmsg_dump_get_buffer length calulationsJohn Ogness2021-01-151-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kmsg_dump_get_buffer() uses @syslog to determine if the syslog prefix should be written to the buffer. However, when calculating the maximum number of records that can fit into the buffer, it always counts the bytes from the syslog prefix. Use @syslog when calculating the maximum number of records that can fit into the buffer. Fixes: e2ae715d66bf ("kmsg - kmsg_dump() use iterator to receive log buffer content") Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20210113164413.1599-1-john.ogness@linutronix.de
| | * | | printk: ringbuffer: fix line countingJohn Ogness2021-01-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Counting text lines in a record simply involves counting the number of newline characters (+1). However, it is searching the full data block for newline characters, even though the text data can be (and often is) a subset of that area. Since the extra area in the data block was never initialized, the result is that extra newlines may be seen and counted. Restrict newline searching to the text data length. Fixes: b6cf8b3f3312 ("printk: add lockless ringbuffer") Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20210113144234.6545-1-john.ogness@linutronix.de
* | | | | Merge tag 'net-5.11-rc5' of ↵Linus Torvalds2021-01-207-13/+24
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Networking fixes for 5.11-rc5, including fixes from bpf, wireless, and can trees. Current release - regressions: - nfc: nci: fix the wrong NCI_CORE_INIT parameters Current release - new code bugs: - bpf: allow empty module BTFs Previous releases - regressions: - bpf: fix signed_{sub,add32}_overflows type handling - tcp: do not mess with cloned skbs in tcp_add_backlog() - bpf: prevent double bpf_prog_put call from bpf_tracing_prog_attach - bpf: don't leak memory in bpf getsockopt when optlen == 0 - tcp: fix potential use-after-free due to double kfree() - mac80211: fix encryption issues with WEP - devlink: use right genl user_ptr when handling port param get/set - ipv6: set multicast flag on the multicast route - tcp: fix TCP_USER_TIMEOUT with zero window Previous releases - always broken: - bpf: local storage helpers should check nullness of owner ptr passed - mac80211: fix incorrect strlen of .write in debugfs - cls_flower: call nla_ok() before nla_next() - skbuff: back tiny skbs with kmalloc() in __netdev_alloc_skb() too" * tag 'net-5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (52 commits) net: systemport: free dev before on error path net: usb: cdc_ncm: don't spew notifications net: mscc: ocelot: Fix multicast to the CPU port tcp: Fix potential use-after-free due to double kfree() bpf: Fix signed_{sub,add32}_overflows type handling can: peak_usb: fix use after free bugs can: vxcan: vxcan_xmit: fix use after free bug can: dev: can_restart: fix use after free bug tcp: fix TCP socket rehash stats mis-accounting net: dsa: b53: fix an off by one in checking "vlan->vid" tcp: do not mess with cloned skbs in tcp_add_backlog() selftests: net: fib_tests: remove duplicate log test net: nfc: nci: fix the wrong NCI_CORE_INIT parameters sh_eth: Fix power down vs. is_opened flag ordering net: Disable NETIF_F_HW_TLS_RX when RXCSUM is disabled netfilter: rpfilter: mask ecn bits before fib lookup udp: mask TOS bits in udp_v4_early_demux() xsk: Clear pool even for inactive queues bpf: Fix helper bpf_map_peek_elem_proto pointing to wrong callback sh_eth: Make PHY access aware of Runtime PM to fix reboot crash ...
| * | | | | bpf: Fix signed_{sub,add32}_overflows type handlingDaniel Borkmann2021-01-201-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix incorrect signed_{sub,add32}_overflows() input types (and a related buggy comment). It looks like this might have slipped in via copy/paste issue, also given prior to 3f50f132d840 ("bpf: Verifier, do explicit ALU32 bounds tracking") the signature of signed_sub_overflows() had s64 a and s64 b as its input args whereas now they are truncated to s32. Thus restore proper types. Also, the case of signed_add32_overflows() is not consistent to signed_sub32_overflows(). Both have s32 as inputs, therefore align the former. Fixes: 3f50f132d840 ("bpf: Verifier, do explicit ALU32 bounds tracking") Reported-by: De4dCr0w <sa516203@mail.ustc.edu.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org>
| * | | | | bpf: Fix helper bpf_map_peek_elem_proto pointing to wrong callbackMircea Cirjaliu2021-01-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I assume this was obtained by copy/paste. Point it to bpf_map_peek_elem() instead of bpf_map_pop_elem(). In practice it may have been less likely hit when under JIT given shielded via 84430d4232c3 ("bpf, verifier: avoid retpoline for map push/pop/peek operation"). Fixes: f1a2e44a3aec ("bpf: add queue and stack maps") Signed-off-by: Mircea Cirjaliu <mcirjaliu@bitdefender.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Mauricio Vasquez <mauriciovasquezbernal@gmail.com> Link: https://lore.kernel.org/bpf/AM7PR02MB6082663DFDCCE8DA7A6DD6B1BBA30@AM7PR02MB6082.eurprd02.prod.outlook.com
| * | | | | Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfJakub Kicinski2021-01-156-9/+20
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Daniel Borkmann says: ==================== pull-request: bpf 2021-01-16 1) Fix a double bpf_prog_put() for BPF_PROG_{TYPE_EXT,TYPE_TRACING} types in link creation's error path causing a refcount underflow, from Jiri Olsa. 2) Fix BTF validation errors for the case where kernel modules don't declare any new types and end up with an empty BTF, from Andrii Nakryiko. 3) Fix BPF local storage helpers to first check their {task,inode} owners for being NULL before access, from KP Singh. 4) Fix a memory leak in BPF setsockopt handling for the case where optlen is zero and thus temporary optval buffer should be freed, from Stanislav Fomichev. 5) Fix a syzbot memory allocation splat in BPF_PROG_TEST_RUN infra for raw_tracepoint caused by too big ctx_size_in, from Song Liu. 6) Fix LLVM code generation issues with verifier where PTR_TO_MEM{,_OR_NULL} registers were spilled to stack but not recognized, from Gilad Reti. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: MAINTAINERS: Update my email address selftests/bpf: Add verifier test for PTR_TO_MEM spill bpf: Support PTR_TO_MEM{,_OR_NULL} register spilling bpf: Reject too big ctx_size_in for raw_tp test run libbpf: Allow loading empty BTFs bpf: Allow empty module BTFs bpf: Don't leak memory in bpf getsockopt when optlen == 0 bpf: Update local storage test to check handling of null ptrs bpf: Fix typo in bpf_inode_storage.c bpf: Local storage helpers should check nullness of owner ptr passed bpf: Prevent double bpf_prog_put call from bpf_tracing_prog_attach ==================== Link: https://lore.kernel.org/r/20210116002025.15706-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| | * | | | | bpf: Support PTR_TO_MEM{,_OR_NULL} register spillingGilad Reti2021-01-131-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for pointer to mem register spilling, to allow the verifier to track pointers to valid memory addresses. Such pointers are returned for example by a successful call of the bpf_ringbuf_reserve helper. The patch was partially contributed by CyberArk Software, Inc. Fixes: 457f44363a88 ("bpf: Implement BPF ring buffer and verifier support for it") Suggested-by: Yonghong Song <yhs@fb.com> Signed-off-by: Gilad Reti <gilad.reti@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: KP Singh <kpsingh@kernel.org> Link: https://lore.kernel.org/bpf/20210113053810.13518-1-gilad.reti@gmail.com
| | * | | | | bpf: Allow empty module BTFsAndrii Nakryiko2021-01-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some modules don't declare any new types and end up with an empty BTF, containing only valid BTF header and no types or strings sections. This currently causes BTF validation error. There is nothing wrong with such BTF, so fix the issue by allowing module BTFs with no types or strings. Fixes: 36e68442d1af ("bpf: Load and verify kernel module BTFs") Reported-by: Christopher William Snowhill <chris@kode54.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210110070341.1380086-1-andrii@kernel.org