summaryrefslogtreecommitdiffstats
path: root/kernel/time
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'core-rcu-2021-02-17' of ↵Linus Torvalds2021-02-211-0/+14
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "These are the latest RCU updates for v5.12: - Documentation updates. - Miscellaneous fixes. - kfree_rcu() updates: Addition of mem_dump_obj() to provide allocator return addresses to more easily locate bugs. This has a couple of RCU-related commits, but is mostly MM. Was pulled in with akpm's agreement. - Per-callback-batch tracking of numbers of callbacks, which enables better debugging information and smarter reactions to large numbers of callbacks. - The first round of changes to allow CPUs to be runtime switched from and to callback-offloaded state. - CONFIG_PREEMPT_RT-related changes. - RCU CPU stall warning updates. - Addition of polling grace-period APIs for SRCU. - Torture-test and torture-test scripting updates, including a "torture everything" script that runs rcutorture, locktorture, scftorture, rcuscale, and refscale. Plus does an allmodconfig build. - nolibc fixes for the torture tests" * tag 'core-rcu-2021-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (130 commits) percpu_ref: Dump mem_dump_obj() info upon reference-count underflow rcu: Make call_rcu() print mem_dump_obj() info for double-freed callback mm: Make mem_obj_dump() vmalloc() dumps include start and length mm: Make mem_dump_obj() handle vmalloc() memory mm: Make mem_dump_obj() handle NULL and zero-sized pointers mm: Add mem_dump_obj() to print source of memory block tools/rcutorture: Fix position of -lgcc in mkinitrd.sh tools/nolibc: Fix position of -lgcc in the documented example tools/nolibc: Emit detailed error for missing alternate syscall number definitions tools/nolibc: Remove incorrect definitions of __ARCH_WANT_* tools/nolibc: Get timeval, timespec and timezone from linux/time.h tools/nolibc: Implement poll() based on ppoll() tools/nolibc: Implement fork() based on clone() tools/nolibc: Make getpgrp() fall back to getpgid(0) tools/nolibc: Make dup2() rely on dup3() when available tools/nolibc: Add the definition for dup() rcutorture: Add rcutree.use_softirq=0 to RUDE01 and TASKS01 torture: Maintain torture-specific set of CPUs-online books torture: Clean up after torture-test CPU hotplugging rcutorture: Make object_debug also double call_rcu() heap object ...
| * Merge branch 'for-mingo-rcu' of ↵Ingo Molnar2021-02-121-0/+14
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU updates from Paul E. McKenney: - Documentation updates. - Miscellaneous fixes. - kfree_rcu() updates: Addition of mem_dump_obj() to provide allocator return addresses to more easily locate bugs. This has a couple of RCU-related commits, but is mostly MM. Was pulled in with akpm's agreement. - Per-callback-batch tracking of numbers of callbacks, which enables better debugging information and smarter reactions to large numbers of callbacks. - The first round of changes to allow CPUs to be runtime switched from and to callback-offloaded state. - CONFIG_PREEMPT_RT-related changes. - RCU CPU stall warning updates. - Addition of polling grace-period APIs for SRCU. - Torture-test and torture-test scripting updates, including a "torture everything" script that runs rcutorture, locktorture, scftorture, rcuscale, and refscale. Plus does an allmodconfig build. Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * rcu/nocb: Code-style nits in callback-offloading togglingPaul E. McKenney2021-01-061-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | This commit addresses a few code-style nits in callback-offloading toggling, including one that predates this toggling. Cc: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
| | * timer: Add timer_curr_running()Frederic Weisbecker2021-01-061-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit adds a timer_curr_running() function that verifies that the current code is running in the context of the specified timer's handler. Cc: Josh Triplett <josh@joshtriplett.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Neeraj Upadhyay <neeraju@codeaurora.org> Cc: Thomas Gleixner <tglx@linutronix.de> Tested-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
* | | Merge tag 'timers-core-2021-02-15' of ↵Linus Torvalds2021-02-212-7/+7
|\ \ \ | |/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: "Time and timer updates: - Instead of new drivers remove tango, sirf, u300 and atlas drivers - Add suspend/resume support for microchip pit64b - The usual fixes, improvements and cleanups here and there" * tag 'timers-core-2021-02-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timens: Delete no-op time_ns_init() alarmtimer: Update kerneldoc clocksource/drivers/timer-microchip-pit64b: Add clocksource suspend/resume clocksource/drivers/prima: Remove sirf prima driver clocksource/drivers/atlas: Remove sirf atlas driver clocksource/drivers/tango: Remove tango driver clocksource/drivers/u300: Remove the u300 driver dt-bindings: timer: nuvoton: Clarify that interrupt of timer 0 should be specified clocksource/drivers/davinci: Move pr_fmt() before the includes clocksource/drivers/efm32: Drop unused timer code
| * | timens: Delete no-op time_ns_init()Alexey Dobriyan2021-02-051-6/+0
| | | | | | | | | | | | | | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Andrei Vagin <avagin@gmail.com> Link: https://lore.kernel.org/r/20201228215402.GA572900@localhost.localdomain
| * | alarmtimer: Update kerneldocAlexandre Belloni2021-02-051-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update kerneldoc comments to reflect the actual arguments and return values of the documented functions. Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210202013457.3482388-1-alexandre.belloni@bootlin.com
* | | ntp: Use freezable workqueue for RTC synchronizationGeert Uytterhoeven2021-02-051-2/+2
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The bug fixed by commit e3fab2f3de081e98 ("ntp: Fix RTC synchronization on 32-bit platforms") revealed an underlying issue: RTC synchronization may happen anytime, even while the system is partially suspended. On systems where the RTC is connected to an I2C bus, the I2C bus controller may already or still be suspended, triggering a WARNING during suspend or resume from s2ram: WARNING: CPU: 0 PID: 124 at drivers/i2c/i2c-core.h:54 __i2c_transfer+0x634/0x680 i2c i2c-6: Transfer while suspended [...] Workqueue: events_power_efficient sync_hw_clock [...] (__i2c_transfer) (i2c_transfer) (regmap_i2c_read) ... (da9063_rtc_set_time) (rtc_set_time) (sync_hw_clock) (process_one_work) Fix this race condition by using the freezable instead of the normal power-efficient workqueue. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Link: https://lore.kernel.org/r/20210125143039.1051912-1-geert+renesas@glider.be
* | ntp: Fix RTC synchronization on 32-bit platformsGeert Uytterhoeven2021-01-121-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Due to an integer overflow, RTC synchronization now happens every 2s instead of the intended 11 minutes. Fix this by forcing 64-bit arithmetic for the sync period calculation. Annotate the other place which multiplies seconds for consistency as well. Fixes: c9e6189fb03123a7 ("ntp: Make the RTC synchronization more reliable") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210111103956.290378-1-geert+renesas@glider.be
* | timekeeping: Remove unused get_seconds()Chunguang Xu2021-01-121-2/+1
|/ | | | | | | | | | The get_seconds() cleanup seems to have been completed, now it is time to delete the legacy interface to avoid misuse later. Signed-off-by: Chunguang Xu <brookxu@tencent.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/1606816351-26900-1-git-send-email-brookxu@tencent.com
* Merge tag 'timers-urgent-2020-12-27' of ↵Linus Torvalds2020-12-273-15/+4
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Ingo Molnar: "Update/fix two CPU sanity checks in the hotplug and the boot code, and fix a typo in the Kconfig help text. [ Context: the first two commits are the result of an ongoing annotation+review work of (intentional) tick_do_timer_cpu() data races reported by KCSAN, but the annotations aren't fully cooked yet ]" * tag 'timers-urgent-2020-12-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timekeeping: Fix spelling mistake in Kconfig "fullfill" -> "fulfill" tick/sched: Remove bogus boot "safety" check tick: Remove pointless cpu valid check in hotplug code
| * timekeeping: Fix spelling mistake in Kconfig "fullfill" -> "fulfill"Colin Ian King2020-12-181-1/+1
| | | | | | | | | | | | | | | | | | | | There is a spelling mistake in the Kconfig help text. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20201217171705.57586-1-colin.king@canonical.com
| * tick/sched: Remove bogus boot "safety" checkThomas Gleixner2020-12-161-7/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | can_stop_idle_tick() checks whether the do_timer() duty has been taken over by a CPU on boot. That's silly because the boot CPU always takes over with the initial clockevent device. But even if no CPU would have installed a clockevent and taken over the duty then the question whether the tick on the current CPU can be stopped or not is moot. In that case the current CPU would have no clockevent either, so there would be nothing to keep ticking. Remove it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/20201206212002.725238293@linutronix.de
| * tick: Remove pointless cpu valid check in hotplug codeThomas Gleixner2020-12-161-7/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tick_handover_do_timer() which is invoked when a CPU is unplugged has a check for cpumask_first(cpu_online_mask) when it tries to hand over the tick update duty. Checking the result of cpumask_first() there is pointless because if the online mask is empty at this point, then this would be the last CPU in the system going offline, which is impossible. There is always at least one CPU remaining. If online mask would be really empty then the timer duty would be the least of the resulting problems. Remove the well meant check simply because it is pointless and confusing. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/20201206212002.582579516@linutronix.de
* | Merge tag 'asm-generic-timers-5.11' of ↵Linus Torvalds2020-12-166-58/+48
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic cross-architecture timer cleanup from Arnd Bergmann: "This cleans up two ancient timer features that were never completed in the past, CONFIG_GENERIC_CLOCKEVENTS and CONFIG_ARCH_USES_GETTIMEOFFSET. There was only one user left for the ARCH_USES_GETTIMEOFFSET variant of clocksource implementations, the ARM EBSA110 platform. Rather than changing to use modern timekeeping, we remove the platform entirely as Russell no longer uses his machine and nobody else seems to have one any more. The conditional code for using arch_gettimeoffset() is removed as a result. For CONFIG_GENERIC_CLOCKEVENTS, there are still a couple of platforms not using clockevent drivers: parisc, ia64, most of m68k, and one Arm platform. These all do timer ticks slighly differently, and this gets cleaned up to the point they at least all call the same helper function. Instead of most platforms using 'select GENERIC_CLOCKEVENTS' in Kconfig, the polarity is now reversed, with the few remaining ones selecting LEGACY_TIMER_TICK instead" * tag 'asm-generic-timers-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: timekeeping: default GENERIC_CLOCKEVENTS to enabled timekeeping: remove xtime_update m68k: remove timer_interrupt() function m68k: change remaining timers to legacy_timer_tick m68k: m68328: use legacy_timer_tick() m68k: sun3/sun3c: use legacy_timer_tick m68k: split heartbeat out of timer function m68k: coldfire: use legacy_timer_tick() parisc: use legacy_timer_tick ARM: rpc: use legacy_timer_tick ia64: convert to legacy_timer_tick timekeeping: add CONFIG_LEGACY_TIMER_TICK timekeeping: remove arch_gettimeoffset net: remove am79c961a driver ARM: remove ebsa110 platform
| * | timekeeping: default GENERIC_CLOCKEVENTS to enabledArnd Bergmann2020-10-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Almost all machines use GENERIC_CLOCKEVENTS, so it feels wrong to require each one to select that symbol manually. Instead, enable it whenever CONFIG_LEGACY_TIMER_TICK is disabled as a simplification. It should be possible to select both GENERIC_CLOCKEVENTS and LEGACY_TIMER_TICK from an architecture now and decide at runtime between the two. For the clockevents arch-support.txt file, this means that additional architectures are marked as TODO when they have at least one machine that still uses LEGACY_TIMER_TICK, rather than being marked 'ok' when at least one machine has been converted. This means that both m68k and arm (for riscpc) revert to TODO. At this point, we could just always enable CONFIG_GENERIC_CLOCKEVENTS rather than leaving it off when not needed. I built an m68k defconfig kernel (using gcc-10.1.0) and found that this would add around 5.5KB in kernel image size: text data bss dec hex filename 3861936 1092236 196656 5150828 4e986c obj-m68k/vmlinux-no-clockevent 3866201 1093832 196184 5156217 4ead79 obj-m68k/vmlinux-clockevent On Arm (MACH_RPC), that difference appears to be twice as large, around 11KB on top of an 6MB vmlinux. Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
| * | timekeeping: remove xtime_updateArnd Bergmann2020-10-303-18/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are no more users of xtime_update aside from legacy_timer_tick(), so fold it into that function and remove the declaration. update_process_times() is now only called inside of the kernel/time/ code, so the declaration can be moved there. Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
| * | timekeeping: add CONFIG_LEGACY_TIMER_TICKArnd Bergmann2020-10-303-0/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All platforms that currently do not use generic clockevents roughly call the same set of functions in their timer interrupts: xtime_update(), update_process_times() and profile_tick(), sometimes in a different sequence. Add a helper function that performs all three of them, to make the callers more uniform and simplify the interface. Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
| * | timekeeping: remove arch_gettimeoffsetArnd Bergmann2020-10-303-41/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | With Arm EBSA110 gone, nothing uses it any more, so the corresponding code and the Kconfig option can be removed. Acked-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
* | | Merge tag 'sched-core-2020-12-14' of ↵Linus Torvalds2020-12-141-4/+2
|\ \ \ | |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Thomas Gleixner: - migrate_disable/enable() support which originates from the RT tree and is now a prerequisite for the new preemptible kmap_local() API which aims to replace kmap_atomic(). - A fair amount of topology and NUMA related improvements - Improvements for the frequency invariant calculations - Enhanced robustness for the global CPU priority tracking and decision making - The usual small fixes and enhancements all over the place * tag 'sched-core-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (61 commits) sched/fair: Trivial correction of the newidle_balance() comment sched/fair: Clear SMT siblings after determining the core is not idle sched: Fix kernel-doc markup x86: Print ratio freq_max/freq_base used in frequency invariance calculations x86, sched: Use midpoint of max_boost and max_P for frequency invariance on AMD EPYC x86, sched: Calculate frequency invariance for AMD systems irq_work: Optimize irq_work_single() smp: Cleanup smp_call_function*() irq_work: Cleanup sched: Limit the amount of NUMA imbalance that can exist at fork time sched/numa: Allow a floating imbalance between NUMA nodes sched: Avoid unnecessary calculation of load imbalance at clone time sched/numa: Rename nr_running and break out the magic number sched: Make migrate_disable/enable() independent of RT sched/topology: Condition EAS enablement on FIE support arm64: Rebuild sched domains on invariance status changes sched/topology,schedutil: Wrap sched domains rebuild sched/uclamp: Allow to reset a task uclamp constraint value sched/core: Fix typos in comments Documentation: scheduler: fix information on arch SD flags, sched_domain and sched_debug ...
| * | Merge branch 'linus' into sched/core, to resolve semantic conflictIngo Molnar2020-11-274-16/+2
| |\ \ | | | | | | | | | | | | Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | irq_work: CleanupPeter Zijlstra2020-11-241-4/+2
| | |/ | |/| | | | | | | | | | | | | | | | | | | Get rid of the __call_single_node union and clean up the API a little to avoid external code relying on the structure layout as much. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
* | | Merge tag 'timers-core-2020-12-14' of ↵Linus Torvalds2020-12-1413-251/+366
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timers and timekeeping updates from Thomas Gleixner: "Core: - Robustness improvements for the NOHZ tick management - Fixes and consolidation of the NTP/RTC synchronization code - Small fixes and improvements in various places - A set of function documentation udpates and fixes Drivers: - Cleanups and improvements in various clocksoure/event drivers - Removal of the EZChip NPS clocksource driver as the platfrom support was removed from ARC - The usual set of new device tree binding and json conversions - The RTC driver which have been acked by the RTC maintainer: * fix a long standing bug in the MC146818 library code which can cause reading garbage during the RTC internal update. * changes related to the NTP/RTC consolidation work" * tag 'timers-core-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits) ntp: Fix prototype in the !CONFIG_GENERIC_CMOS_UPDATE case tick/sched: Make jiffies update quick check more robust ntp: Consolidate the RTC update implementation ntp: Make the RTC sync offset less obscure ntp, rtc: Move rtc_set_ntp_time() to ntp code ntp: Make the RTC synchronization more reliable rtc: core: Make the sync offset default more realistic rtc: cmos: Make rtc_cmos sync offset correct rtc: mc146818: Reduce spinlock section in mc146818_set_time() rtc: mc146818: Prevent reading garbage clocksource/drivers/sh_cmt: Fix potential deadlock when calling runtime PM clocksource/drivers/arm_arch_timer: Correct fault programming of CNTKCTL_EL1.EVNTI clocksource/drivers/arm_arch_timer: Use stable count reader in erratum sne clocksource/drivers/dw_apb_timer_of: Add error handling if no clock available clocksource/drivers/riscv: Make RISCV_TIMER depends on RISCV_SBI clocksource/drivers/ingenic: Fix section mismatch clocksource/drivers/cadence_ttc: Fix memory leak in ttc_setup_clockevent() dt-bindings: timer: renesas: tmu: Convert to json-schema dt-bindings: timer: renesas: tmu: Document r8a774e1 bindings clocksource/drivers/orion: Add missing clk_disable_unprepare() on error path ...
| * | | tick/sched: Make jiffies update quick check more robustThomas Gleixner2020-12-111-27/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The quick check in tick_do_update_jiffies64() whether jiffies need to be updated is not really correct under all circumstances and on all architectures, especially not on 32bit systems. The quick check does: if (now < READ_ONCE(tick_next_period)) return; and the counterpart in the update is: WRITE_ONCE(tick_next_period, next_update_time); This has two problems: 1) On weakly ordered architectures there is no guarantee that the stores before the WRITE_ONCE() are visible which means that other CPUs can operate on a stale jiffies value. 2) On 32bit the store of tick_next_period which is an u64 is split into two 32bit stores. If the first 32bit store advances tick_next_period far out and the second 32bit store is delayed (virt, NMI ...) then jiffies will become stale until the second 32bit store happens. Address this by seperating the handling for 32bit and 64bit. On 64bit problem #1 is addressed by replacing READ_ONCE() / WRITE_ONCE() with smp_load_acquire() / smp_store_release(). On 32bit problem #2 is addressed by protecting the quick check with the jiffies sequence counter. The load and stores can be plain because the sequence count mechanics provides the required barriers already. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/87czzpc02w.fsf@nanos.tec.linutronix.de
| * | | ntp: Consolidate the RTC update implementationThomas Gleixner2020-12-111-92/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The code for the legacy RTC and the RTC class based update are pretty much the same. Consolidate the common parts into one function and just invoke the actual setter functions. For RTC class based devices the update code checks whether the offset is valid for the device, which is usually not the case for the first invocation. If it's not the same it stores the correct offset and lets the caller try again. That's not much different from the previous approach where the first invocation had a pretty low probability to actually hit the allowed window. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20201206220542.355743355@linutronix.de
| * | | ntp: Make the RTC sync offset less obscureThomas Gleixner2020-12-111-23/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current RTC set_offset_nsec value is not really intuitive to understand. tsched twrite(t2.tv_sec - 1) t2 (seconds increment) The offset is calculated from twrite based on the assumption that t2 - twrite == 1s. That means for the MC146818 RTC the offset needs to be negative so that the write happens 500ms before t2. It's easier to understand when the whole calculation is based on t2. That avoids negative offsets and the meaning is obvious: t2 - twrite: The time defined by the chip when seconds increment after the write. twrite - tsched: The time for the transport to the point where the chip is updated. ==> set_offset_nsec = t2 - tsched ttransport = twrite - tsched tRTCinc = t2 - twrite ==> set_offset_nsec = ttransport + tRTCinc tRTCinc is a chip property and can be obtained from the data sheet. ttransport depends on how the RTC is connected. It is close to 0 for directly accessible RTCs. For RTCs behind a slow bus, e.g. i2c, it's the time required to send the update over the bus. This can be estimated or even calibrated, but that's a different problem. Adjust the implementation and update comments accordingly. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20201206220542.263204937@linutronix.de
| * | | ntp, rtc: Move rtc_set_ntp_time() to ntp codeThomas Gleixner2020-12-111-3/+85
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | rtc_set_ntp_time() is not really RTC functionality as the code is just a user of RTC. Move it into the NTP code which allows further cleanups. Requested-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20201206220542.166871172@linutronix.de
| * | | ntp: Make the RTC synchronization more reliableThomas Gleixner2020-12-112-42/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Miroslav reported that the periodic RTC synchronization in the NTP code fails more often than not to hit the specified update window. The reason is that the code uses delayed_work to schedule the update which needs to be in thread context as the underlying RTC might be connected via a slow bus, e.g. I2C. In the update function it verifies whether the current time is correct vs. the requirements of the underlying RTC. But delayed_work is using the timer wheel for scheduling which is inaccurate by design. Depending on the distance to the expiry the wheel gets less granular to allow batching and to avoid the cascading of the original timer wheel. See 500462a9de65 ("timers: Switch to a non-cascading wheel") and the code for further details. The code already deals with this by splitting the 660 seconds period into a long 659 seconds timer and then retrying with a smaller delta. But looking at the actual granularities of the timer wheel (which depend on the HZ configuration) the 659 seconds timer ends up in an outer wheel level and is affected by a worst case granularity of: HZ Granularity 1000 32s 250 16s 100 40s So the initial timer can be already off by max 12.5% which is not a big issue as the period of the sync is defined as ~11 minutes. The fine grained second attempt schedules to the desired update point with a timer expiring less than a second from now. Depending on the actual delta and the HZ setting even the second attempt can end up in outer wheel levels which have a large enough granularity to make the correctness check fail. As this is a fundamental property of the timer wheel there is no way to make this more accurate short of iterating in one jiffies steps towards the update point. Switch it to an hrtimer instead which schedules the actual update work. The hrtimer will expire precisely (max 1 jiffie delay when high resolution timers are not available). The actual scheduling delay of the work is the same as before. The update is triggered from do_adjtimex() which is a bit racy but not much more racy than it was before: if (ntp_synced()) queue_delayed_work(system_power_efficient_wq, &sync_work, 0); which is racy when the work is currently executed and has not managed to reschedule itself. This becomes now: if (ntp_synced() && !hrtimer_is_queued(&sync_hrtimer)) queue_work(system_power_efficient_wq, &sync_work, 0); which is racy when the hrtimer has expired and the work is currently executed and has not yet managed to rearm the hrtimer. Not a big problem as it just schedules work for nothing. The new implementation has a safe guard in place to catch the case where the hrtimer is queued on entry to the work function and avoids an extra update attempt of the RTC that way. Reported-by: Miroslav Lichvar <mlichvar@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Miroslav Lichvar <mlichvar@redhat.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20201206220542.062910520@linutronix.de
| * | | tick: Get rid of tick_periodThomas Gleixner2020-11-194-18/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The variable tick_period is initialized to NSEC_PER_TICK / HZ during boot and never updated again. If NSEC_PER_TICK is not an integer multiple of HZ this computation is less accurate than TICK_NSEC which has proper rounding in place. Aside of the inaccuracy there is no reason for having this variable at all. It's just a pointless indirection and all usage sites can just use the TICK_NSEC constant. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201117132006.766643526@linutronix.de
| * | | tick/sched: Release seqcount before invoking calc_load_global()Yunfeng Ye2020-11-191-1/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | calc_load_global() does not need the sequence count protection. [ tglx: Split it up properly and added comments ] Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201117132006.660902274@linutronix.de
| * | | tick/sched: Optimize tick_do_update_jiffies64() furtherThomas Gleixner2020-11-191-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that it's clear that there is always one tick to account, simplify the calculations some more. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201117132006.565663056@linutronix.de
| * | | tick/sched: Reduce seqcount held scope in tick_do_update_jiffies64()Yunfeng Ye2020-11-191-25/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If jiffies are up to date already (caller lost the race against another CPU) there is no point to change the sequence count. Doing that just forces other CPUs into the seqcount retry loop in tick_nohz_next_event() for nothing. Just bail out early. [ tglx: Rewrote most of it ] Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201117132006.462195901@linutronix.de
| * | | tick/sched: Use tick_next_period for lockless quick checkThomas Gleixner2020-11-191-13/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | No point in doing calculations. tick_next_period = last_jiffies_update + tick_period Just check whether now is before tick_next_period to figure out whether jiffies need an update. Add a comment why the intentional data race in the quick check is safe or not so safe in a 32bit corner case and why we don't worry about it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201117132006.337366695@linutronix.de
| * | | tick: Document protections for tick related dataThomas Gleixner2020-11-192-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The protection rules for tick_next_period and last_jiffies_update are blury at best. Clarify this. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201117132006.197713794@linutronix.de
| * | | tick/broadcast: Serialize access to tick_next_periodThomas Gleixner2020-11-191-3/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tick_broadcast_setup_oneshot() accesses tick_next_period twice without any serialization. This is wrong in two aspects: - Reading it twice might make the broadcast data inconsistent if the variable is updated concurrently. - On 32bit systems the access might see an partial update Protect it with jiffies_lock. That's safe as none of the callchains leading up to this function can create a lock ordering violation: timer interrupt run_local_timers() hrtimer_run_queues() hrtimer_switch_to_hres() tick_init_highres() tick_switch_to_oneshot() tick_broadcast_switch_to_oneshot() or tick_check_oneshot_change() tick_nohz_switch_to_nohz() tick_switch_to_oneshot() tick_broadcast_switch_to_oneshot() Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201117132006.061341507@linutronix.de
| * | | hrtimer: Fix kernel-doc markupsMauro Carvalho Chehab2020-11-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The hrtimer_get_remaining() markup is documenting, instead, __hrtimer_get_remaining(), as it is placed at the C file. In order to properly document it, a kernel-doc markup is needed together with the function prototype. So, add a new one, while preserving the existing one, just fixing the function name. The hrtimer_is_queued prototype has a typo: it is using '=' instead of '-' to split: identifier - description as required by kernel-doc markup. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/9dc87808c2fd07b7e050bafcd033c5ef05808fea.1605521731.git.mchehab+huawei@kernel.org
| * | | timers: Make run_local_timers() staticThomas Gleixner2020-11-161-24/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | No users outside of the timer code. Move the caller below this function to avoid a pointless forward declaration. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | timekeeping: Address parameter documentation issues for various functionsAlex Shi2020-11-151-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The kernel-doc parser complains: kernel/time/timekeeping.c:1543: warning: Function parameter or member 'ts' not described in 'read_persistent_clock64' kernel/time/timekeeping.c:764: warning: Function parameter or member 'tk' not described in 'timekeeping_forward_now' kernel/time/timekeeping.c:1331: warning: Function parameter or member 'ts' not described in 'timekeeping_inject_offset' kernel/time/timekeeping.c:1331: warning: Excess function parameter 'tv' description in 'timekeeping_inject_offset' Add the missing parameter documentations and rename the 'tv' parameter of timekeeping_inject_offset() to 'ts' so it matches the implemention. [ tglx: Reworded a few docs and massaged changelog ] Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/1605252275-63652-5-git-send-email-alex.shi@linux.alibaba.com
| * | | timekeeping: Fix parameter docs of read_persistent_wall_and_boot_offset()Alex Shi2020-11-151-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Address the following kernel-doc markup warnings: kernel/time/timekeeping.c:1563: warning: Function parameter or member 'wall_time' not described in 'read_persistent_wall_and_boot_offset' kernel/time/timekeeping.c:1563: warning: Function parameter or member 'boot_offset' not described in 'read_persistent_wall_and_boot_offset' The parameters are described but miss the leading '@' and the colon after the parameter names. [ tglx: Massaged changelog ] Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/1605252275-63652-6-git-send-email-alex.shi@linux.alibaba.com
| * | | timekeeping: Add missing parameter docs for pvclock_gtod_[un]register_notifier()Alex Shi2020-11-151-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The kernel-doc parser complains about: kernel/time/timekeeping.c:651: warning: Function parameter or member 'nb' not described in 'pvclock_gtod_register_notifier' kernel/time/timekeeping.c:670: warning: Function parameter or member 'nb' not described in 'pvclock_gtod_unregister_notifier' Add the missing parameter explanations. [ tglx: Massaged changelog ] Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/1605252275-63652-3-git-send-email-alex.shi@linux.alibaba.com
| * | | timekeeping: Fix up function documentation for the NMI safe accessorsThomas Gleixner2020-11-151-25/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Alex reported the following warning: kernel/time/timekeeping.c:464: warning: Function parameter or member 'tkf' not described in '__ktime_get_fast_ns' which is not entirely correct because the documented function is ktime_get_mono_fast_ns() which does not have a parameter, but the kernel-doc parser looks at the function declaration which follows the comment and complains about the missing parameter documentation. Aside of that the documentation for the rest of the NMI safe accessors is either incomplete or missing. - Move the function documentation to the right place - Fixup the references and inconsistencies - Add the missing documentation for ktime_get_raw_fast_ns() Reported-by: Alex Shi <alex.shi@linux.alibaba.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | timekeeping: Add missing parameter documentation for update_fast_timekeeper()Alex Shi2020-11-151-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Address the following warning: kernel/time/timekeeping.c:415: warning: Function parameter or member 'tkf' not described in 'update_fast_timekeeper' [ tglx: Remove the bogus ktime_get_mono_fast_ns() part ] Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/1605252275-63652-2-git-send-email-alex.shi@linux.alibaba.com
| * | | timekeeping: Remove static functions from kernel-doc markupAlex Shi2020-11-151-7/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Various static functions in the timekeeping code have function comments which pretend to be kernel-doc, but are incomplete and trigger parser warnings. As these functions are local to the timekeeping core code there is no need to expose them via kernel-doc. Remove the double star kernel-doc marker and remove excess newlines. [ tglx: Massaged changelog and removed excess newlines ] Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/1605252275-63652-4-git-send-email-alex.shi@linux.alibaba.com
| * | | time: Add missing colons for parameter documentation of time64_to_tm()Alex Shi2020-11-151-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Address these kernel-doc warnings: kernel/time/timeconv.c:79: warning: Function parameter or member 'totalsecs' not described in 'time64_to_tm' kernel/time/timeconv.c:79: warning: Function parameter or member 'offset' not described in 'time64_to_tm' kernel/time/timeconv.c:79: warning: Function parameter or member 'result' not described in 'time64_to_tm' The parameters are described but lack colons after the parameter name. [ tglx: Massaged changelog ] Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/1605252275-63652-1-git-send-email-alex.shi@linux.alibaba.com
| * | | timers: Don't block on ->expiry_lock for TIMER_IRQSAFE timersSebastian Andrzej Siewior2020-11-151-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PREEMPT_RT does not spin and wait until a running timer completes its callback but instead it blocks on a sleeping lock to prevent a livelock in the case that the task waiting for the callback completion preempted the callback. This cannot be done for timers flagged with TIMER_IRQSAFE. These timers can be canceled from an interrupt disabled context even on RT kernels. The expiry callback of such timers is invoked with interrupts disabled so there is no need to use the expiry lock mechanism because obviously the callback cannot be preempted even on RT kernels. Do not use the timer_base::expiry_lock mechanism when waiting for a running callback to complete if the timer is flagged with TIMER_IRQSAFE. Also add a lockdep assertion for RT kernels to validate that the expiry lock mechanism is always invoked in preemptible context. Reported-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201103190937.hga67rqhvknki3tp@linutronix.de
| * | | timer_list: Use printk format instead of open-coded symbol lookupHelge Deller2020-11-151-47/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the "%ps" printk format string to resolve symbol names. This works on all platforms, including ia64, ppc64 and parisc64 on which one needs to dereference pointers to function descriptors instead of function pointers. Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201104163401.GA3984@ls3530.fritz.box
| * | | timekeeping: Convert jiffies_seq to seqcount_raw_spinlock_tDavidlohr Bueso2020-10-262-2/+3
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the new api and associate the seqcounter to the jiffies_lock enabling lockdep support - although for this particular case the write-side locking and non-preemptibility are quite obvious. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201021190749.19363-1-dave@stgolabs.net
* | | Merge tag 'fixes-v5.11' of ↵Linus Torvalds2020-12-141-6/+3
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull misc fixes from Christian Brauner: "This contains several fixes which felt worth being combined into a single branch: - Use put_nsproxy() instead of open-coding it switch_task_namespaces() - Kirill's work to unify lifecycle management for all namespaces. The lifetime counters are used identically for all namespaces types. Namespaces may of course have additional unrelated counters and these are not altered. This work allows us to unify the type of the counters and reduces maintenance cost by moving the counter in one place and indicating that basic lifetime management is identical for all namespaces. - Peilin's fix adding three byte padding to Dmitry's PTRACE_GET_SYSCALL_INFO uapi struct to prevent an info leak. - Two smal patches to convert from the /* fall through */ comment annotation to the fallthrough keyword annotation which I had taken into my branch and into -next before df561f6688fe ("treewide: Use fallthrough pseudo-keyword") made it upstream which fixed this tree-wide. Since I didn't want to invalidate all testing for other commits I didn't rebase and kept them" * tag 'fixes-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: nsproxy: use put_nsproxy() in switch_task_namespaces() sys: Convert to the new fallthrough notation signal: Convert to the new fallthrough notation time: Use generic ns_common::count cgroup: Use generic ns_common::count mnt: Use generic ns_common::count user: Use generic ns_common::count pid: Use generic ns_common::count ipc: Use generic ns_common::count uts: Use generic ns_common::count net: Use generic ns_common::count ns: Add a common refcount into ns_common ptrace: Prevent kernel-infoleak in ptrace_get_syscall_info()
| * | | time: Use generic ns_common::countKirill Tkhai2020-08-191-6/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Switch over time namespaces to use the newly introduced common lifetime counter. Currently every namespace type has its own lifetime counter which is stored in the specific namespace struct. The lifetime counters are used identically for all namespaces types. Namespaces may of course have additional unrelated counters and these are not altered. This introduces a common lifetime counter into struct ns_common. The ns_common struct encompasses information that all namespaces share. That should include the lifetime counter since its common for all of them. It also allows us to unify the type of the counters across all namespaces. Most of them use refcount_t but one uses atomic_t and at least one uses kref. Especially the last one doesn't make much sense since it's just a wrapper around refcount_t since 2016 and actually complicates cleanup operations by having to use container_of() to cast the correct namespace struct out of struct ns_common. Having the lifetime counter for the namespaces in one place reduces maintenance cost. Not just because after switching all namespaces over we will have removed more code than we added but also because the logic is more easily understandable and we indicate to the user that the basic lifetime requirements for all namespaces are currently identical. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/r/159644982033.604812.9406853013011123238.stgit@localhost.localdomain Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
* | | | namespace: make timens_on_fork() return nothingHui Su2020-11-181-4/+2
| |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | timens_on_fork() always return 0, and maybe not need to judge the return value in copy_namespaces(). So make timens_on_fork() return nothing and do not judge its return val in copy_namespaces(). Signed-off-by: Hui Su <sh_def@163.com> Link: https://lore.kernel.org/r/20201117161750.GA45121@rlk Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>