summaryrefslogtreecommitdiffstats
path: root/arch/s390/kernel/smp.c
Commit message (Collapse)AuthorAgeFilesLines
* s390/smp: use physical address for SIGP_SET_PREFIX commandAlexander Gordeev2022-03-271-1/+1
| | | | | | | | | | | | | Signal processor SIGP_SET_PREFIX command expects physical address of the lowcore to be installed, but instead the virtual address is provided. Note: this does not fix a bug currently, since virtual and physical addresses are identical. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390/maccess: rework absolute lowcore accessorsAlexander Gordeev2022-03-271-6/+6
| | | | | | | | | | | | Macro mem_assign_absolute() is able to access the whole memory, but is only used and makes sense when updating the absolute lowcore. Instead, introduce get_abs_lowcore() and put_abs_lowcore() macros that limit access to absolute lowcore addresses only. Suggested-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390/smp: cleanup control register update routinesAlexander Gordeev2022-03-271-24/+12
| | | | | | | | Get rid of duplicate code and redundant data. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390/smp: cleanup target CPU callback startingAlexander Gordeev2022-03-271-4/+11
| | | | | | | | | | | | | | | | Macro mem_assign_absolute() is used to initialize a target CPU lowcore callback parameters. But despite the macro name it writes to the absolute lowcore only if the target CPU is offline. In case the CPU is online the macro does implicitly write to the normal memory. That behaviour is correct, but extremely subtle. Sacrifice few program bits in favour of clarity and distinguish between online vs offline CPUs and normal vs absolute lowcore pointer. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390: raise minimum supported machine generation to z10Vasily Gorbik2022-03-101-1/+0
| | | | | | | | | | | | | | | | | | | | | | Machine generations up to z9 (released in May 2006) have been officially out of service for several years now (z9 end of service - January 31, 2019). No distributions build kernels supporting those old machine generations anymore, except Debian, which seems to pick the oldest supported generation. The team supporting Debian on s390 has been notified about the change. Raising minimum supported machine generation to z10 helps to reduce maintenance cost and effectively remove code, which is not getting enough testing coverage due to lack of older hardware and distributions support. Besides that this unblocks some optimization opportunities and allows to use wider instruction set in asm files for future features implementation. Due to this change spectre mitigation and usercopy implementations could be drastically simplified and many newer instructions could be converted from ".insn" encoding to instruction names. Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390/smp: sort out physical vs virtual pointers usageAlexander Gordeev2022-03-011-3/+3
| | | | | | | | | | | | | | With commit 5789284710aa ("s390/smp: reallocate IPL CPU lowcore") virtual addresses are wrongly passed to memblock_free_late() and SPX instructions on IPL CPU reinitialization. Note: this does not fix a bug currently, since virtual and physical addresses are identical. Fixes: 5789284710aa ("s390/smp: reallocate IPL CPU lowcore") Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390/maccess: fix semantics of memcpy_real() and its callersAlexander Gordeev2022-02-091-1/+1
| | | | | | | | | | | | | | | | | | | | | There is a confusion with regard to the source address of memcpy_real() and calling functions. While the declared type for a source assumes a virtual address, in fact it always called with physical address of the source. This confusion led to bugs in copy_oldmem_kernel() and copy_oldmem_user() functions, where __pa() macro applied mistakenly to physical addresses. It does not lead to a real issue, since virtual and physical addresses are currently the same. Fix both the bugs and memcpy_real() prototype by making type of source address consistent to the function name and the way it actually used. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390: remove invalid email address of Heiko CarstensHeiko Carstens2022-02-061-1/+0
| | | | | | | | | | | Remove my old invalid email address which can be found in a couple of files. Instead of updating it, just remove my contact data completely from source files. We have git and other tools which allow to figure out who is responsible for what with recent contact data. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390/crash_dump: fix virtual vs physical address handlingHeiko Carstens2021-12-201-11/+7
| | | | | | | | | | | | | | | Signal processor STORE STATUS requires a physical address where register contents are supposed to be written to, however the kernel must read the data via the corresponding virtual address. Also the allocated save_area, where register contents are copied to, resides in virtual address space. Fix this by using proper __pa() conversion, or correct memblock_alloc() invocation. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390/nmi: disable interrupts on extended save area updateAlexander Gordeev2021-12-161-4/+9
| | | | | | | | | | | Updating of the pointer to machine check extended save area on the IPL CPU needs the lowcore protection to be disabled. Disable interrupts while the protection is off to avoid unnoticed writes to the lowcore. Suggested-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390/smp: fix memblock_phys_free() vs memblock_free() confusionHeiko Carstens2021-12-161-1/+1
| | | | | | | | | | | memblock_phys_free() is used on a virtual address. Fix this by using memblock_free(). Note: this doesn't fix a bug currently, since virtual and physical addresses are identical. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390/nmi: add missing __pa/__va address conversion of extended save areaHeiko Carstens2021-12-061-1/+1
| | | | | | | | | | | | Add missing __pa/__va address conversion of machine check extended save area designation, which is an absolute address. Note: this currently doesn't fix a real bug, since virtual addresses are indentical to physical ones. Reported-by: Vineeth Vijayan <vneethv@linux.ibm.com> Tested-by: Vineeth Vijayan <vneethv@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* memblock: rename memblock_free to memblock_phys_freeMike Rapoport2021-11-061-2/+2
| | | | | | | | | | | | | | | | | | | | | | | Since memblock_free() operates on a physical range, make its name reflect it and rename it to memblock_phys_free(), so it will be a logical counterpart to memblock_phys_alloc(). The callers are updated with the below semantic patch: @@ expression addr; expression size; @@ - memblock_free(addr, size); + memblock_phys_free(addr, size); Link: https://lkml.kernel.org/r/20210930185031.18648-6-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Juergen Gross <jgross@suse.com> Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* memblock: drop memblock_free_early_nid() and memblock_free_early()Mike Rapoport2021-11-061-1/+1
| | | | | | | | | | | | | | | | memblock_free_early_nid() is unused and memblock_free_early() is an alias for memblock_free(). Replace calls to memblock_free_early() with calls to memblock_free() and remove memblock_free_early() and memblock_free_early_nid(). Link: https://lkml.kernel.org/r/20210930185031.18648-4-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Juergen Gross <jgross@suse.com> Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* s390/topology: fix topology information when calling cpu hotplug notifiersSven Schnelle2021-09-071-2/+7
| | | | | | | | | | | | | | | | | | | | | The cpu hotplug notifiers are called without updating the core/thread masks when a new CPU is added. This causes problems with code setting up data structures in a cpu hotplug notifier, and relying on that later in normal code. This caused a crash in the new core scheduling code (SCHED_CORE), where rq->core was set up in a notifier depending on cpu masks. To fix this, add a cpu_setup_mask which is used in update_cpu_masks() instead of the cpu_online_mask to determine whether the cpu masks should be set for a certain cpu. Also move update_cpu_masks() to update the masks before calling notify_cpu_starting() so that the notifiers are seeing the updated masks. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Cc: <stable@vger.kernel.org> [hca@linux.ibm.com: get rid of cpu_online_mask handling] Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390/smp: do not use nodat_stack for secondary CPU startAlexander Gordeev2021-08-261-15/+10
| | | | | | | | | | | | | | | | | The secondary CPU start C routine uses nodat_stack as a interim stack before finally switching to kernel_stack. Such scheme is superfluous, since the assembler restart interrupt handler (that secondary CPU starter is called from) does not need to use any stack for switching into DAT mode. Once DAT is on, any stack including virtually- mapped one could be used. Avoid the use of nodat_stack and smp_start_secondary() helper. Instead, initiate kernel_stack directly from the restart interrupt handler. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390/smp: enable DAT before CPU restart callback is calledAlexander Gordeev2021-08-261-9/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The restart interrupt is triggered whenever a secondary CPU is brought online, a remote function call dispatched from another CPU or a manual PSW restart is initiated and causes the system to kdump. The handling routine is always called with DAT turned off. It then initializes the stack frame and invokes a callback. The existing callbacks handle DAT as follows: * __do_restart() and __machine_kexec() turn in on upon entry; * __ipl_run(), __reipl_run() and __dump_run() do not turn it right away, but all of them call diag308() - which turns DAT on, but only if kasan is enabled; In addition to the described complexity all callbacks (and the functions they call) should avoid kasan instrumentation while DAT is off. This update enables DAT in the assembler restart handler and relieves any callbacks (which are mostly C functions) from dealing with DAT altogether. There are four types of CPU restart that initialize control registers in different ways: 1. Start of secondary CPU on boot - control registers are inherited from the IPL CPU; 2. Restart of online CPU - control registers of the CPU being restarted are kept; 3. Hotplug of offline CPU - control registers are inherited from the starting CPU; 4. Start of offline CPU triggered by manual PSW restart - the control registers are read from the absolute lowcore and contain the boot time IPL CPU values updated with all follow-up calls of smp_ctl_set_bit() and smp_ctl_clear_bit() routines; In first three cases contents of the control registers is the most recent. In the latter case control registers are good enough to facilitate successful completion of kdump operation. Suggested-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390: rename dma section to amode31Heiko Carstens2021-08-051-1/+1
| | | | | | | | | | | | | | | | | | | The dma section name is confusing, since the code which resides within that section has nothing to do with direct memory access. Instead the limitation is that the code has to run in 31 bit addressing mode, and therefore has to reside below 2GB. So the name was chosen since ZONE_DMA is the same region. To reduce confusion rename the section to amode31, which hopefully describes better what this is about. Note: this will also change vmcoreinfo strings - SDMA=... gets renamed to SAMODE31=... - EDMA=... gets renamed to EAMODE31=... Acked-by: Vasily Gorbik <gor@linux.ibm.com> Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390: replace deprecated CPU-hotplug functionsSebastian Andrzej Siewior2021-08-051-4/+4
| | | | | | | | | | | | | The functions get_online_cpus() and put_online_cpus() have been deprecated during the CPU hotplug rework. They map directly to cpus_read_lock() and cpus_read_unlock(). Replace deprecated CPU-hotplug functions with the official version. The behavior remains unchanged. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lore.kernel.org/r/20210803141621.780504-5-bigeasy@linutronix.de Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390/dump: introduce boot data 'oldmem_data'Alexander Egorenkov2021-07-271-2/+2
| | | | | | | | | | | | The new boot data struct shall replace global variables OLDMEM_BASE and OLDMEM_SIZE. It is initialized in the decompressor and passed to the decompressed kernel. In comparison to the old solution, this one doesn't access data at fixed physical addresses which will become important when the decompressor becomes relocatable. Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390: preempt: Fix preempt_count initializationValentin Schneider2021-07-081-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | S390's init_idle_preempt_count(p, cpu) doesn't actually let us initialize the preempt_count of the requested CPU's idle task: it unconditionally writes to the current CPU's. This clearly conflicts with idle_threads_init(), which intends to initialize *all* the idle tasks, including their preempt_count (or their CPU's, if the arch uses a per-CPU preempt_count). Unfortunately, it seems the way s390 does things doesn't let us initialize every possible CPU's preempt_count early on, as the pages where this resides are only allocated when a CPU is brought up and are freed when it is brought down. Let the arch-specific code set a CPU's preempt_count when its lowcore is allocated, and turn init_idle_preempt_count() into an empty stub. Fixes: f1a0a376ca0c ("sched/core: Initialize the idle task with preemption disabled") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Link: https://lore.kernel.org/r/20210707163338.1623014-1-valentin.schneider@arm.com Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390: rename CALL_ON_STACK_NORETURN() to call_on_stack_noreturn()Heiko Carstens2021-07-081-1/+1
| | | | | | | | Lower case matches the call_on_stack() macro and is easier to read. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390/smp: use call_on_stack() macroHeiko Carstens2021-07-081-4/+8
| | | | | | Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* Merge tag 's390-5.14-1' of ↵Linus Torvalds2021-07-041-44/+87
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Vasily Gorbik: - Rework inline asm to get rid of error prone "register asm" constructs, which are problematic especially when code instrumentation is enabled. In particular introduce and use register pair union to allocate even/odd register pairs. Unfortunately this breaks compatibility with older clang compilers and minimum clang version for s390 has been raised to 13. https://lore.kernel.org/linux-next/CAK7LNARuSmPCEy-ak0erPrPTgZdGVypBROFhtw+=3spoGoYsyw@mail.gmail.com/ - Fix gcc 11 warnings, which triggered various minor reworks all over the code. - Add zstd kernel image compression support. - Rework boot CPU lowcore handling. - De-duplicate and move kernel memory layout setup logic earlier. - Few fixes in preparation for FORTIFY_SOURCE performing compile-time and run-time field bounds checking for mem functions. - Remove broken and unused power management support leftovers in s390 drivers. - Disable stack-protector for decompressor and purgatory to fix buildroot build. - Fix vt220 sclp console name to match the char device name. - Enable HAVE_IOREMAP_PROT and add zpci_set_irq()/zpci_clear_irq() in zPCI code. - Remove some implausible WARN_ON_ONCEs and remove arch specific counter transaction call backs in favour of default transaction handling in perf code. - Extend/add new uevents for online/config/mode state changes of AP card / queue device in zcrypt. - Minor entry and ccwgroup code improvements. - Other small various fixes and improvements all over the code. * tag 's390-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (91 commits) s390/dasd: use register pair instead of register asm s390/qdio: get rid of register asm s390/ioasm: use symbolic names for asm operands s390/ioasm: get rid of register asm s390/cmf: get rid of register asm s390/lib,string: get rid of register asm s390/lib,uaccess: get rid of register asm s390/string: get rid of register asm s390/cmpxchg: use register pair instead of register asm s390/mm,pages-states: get rid of register asm s390/lib,xor: get rid of register asm s390/timex: get rid of register asm s390/hypfs: use register pair instead of register asm s390/zcrypt: Switch to flexible array member s390/speculation: Use statically initialized const for instructions virtio/s390: get rid of open-coded kvm hypercall s390/pci: add zpci_set_irq()/zpci_clear_irq() scripts/min-tool-version.sh: Raise minimum clang version to 13.0.0 for s390 s390/ipl: use register pair instead of register asm s390/mem_detect: fix tprot() program check new psw handling ...
| * s390/smp: use register pair instead of register asmHeiko Carstens2021-06-181-11/+11
| | | | | | | | | | Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
| * s390/smp: remove redundant pcpu::lowcore memberAlexander Gordeev2021-06-071-15/+21
| | | | | | | | | | | | | | | | | | | | Per-CPU pointer to lowcore is stored in global lowcore_ptr[] array and duplicated in struct pcpu::lowcore member. This update removes the redundancy. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
| * s390/smp: do not preserve boot CPU lowcore on hotplugAlexander Gordeev2021-06-071-29/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | Once the kernel is running the boot CPU lowcore becomes freeable and does not differ from the secondary CPU ones in any way. Make use of it and do not preserve the boot CPU lowcore on unplugging. That allows returning unused memory when the boot CPU is offline and makes the code more clear. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
| * s390/smp: reallocate IPL CPU lowcoreAlexander Gordeev2021-06-071-0/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The lowcore for IPL CPU is special. It is allocated early in the boot process using memblock and never freed since. The reason is pcpu_alloc_lowcore() and pcpu_free_lowcore() routines use page allocator which is not available when the IPL CPU is getting initialized. Similar problem is already addressed for stacks - once the virtual memory is available the early boot stacks get re- allocated. Doing the same for lowcore will allow freeing the IPL CPU lowcore and make no difference between the boot and secondary CPUs. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
| * s390/facilities: move stfl information from lowcore to global dataSven Schnelle2021-06-071-4/+0
| | | | | | | | | | | | | | | | | | | | With gcc-11, there are a lot of warnings because the facility functions are accessing lowcore through a null pointer. Fix this by moving the facility arrays away from lowcore. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* | sched/core: Initialize the idle task with preemption disabledValentin Schneider2021-05-121-1/+0
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As pointed out by commit de9b8f5dcbd9 ("sched: Fix crash trying to dequeue/enqueue the idle thread") init_idle() can and will be invoked more than once on the same idle task. At boot time, it is invoked for the boot CPU thread by sched_init(). Then smp_init() creates the threads for all the secondary CPUs and invokes init_idle() on them. As the hotplug machinery brings the secondaries to life, it will issue calls to idle_thread_get(), which itself invokes init_idle() yet again. In this case it's invoked twice more per secondary: at _cpu_up(), and at bringup_cpu(). Given smp_init() already initializes the idle tasks for all *possible* CPUs, no further initialization should be required. Now, removing init_idle() from idle_thread_get() exposes some interesting expectations with regards to the idle task's preempt_count: the secondary startup always issues a preempt_disable(), requiring some reset of the preempt count to 0 between hot-unplug and hotplug, which is currently served by idle_thread_get() -> idle_init(). Given the idle task is supposed to have preemption disabled once and never see it re-enabled, it seems that what we actually want is to initialize its preempt_count to PREEMPT_DISABLED and leave it there. Do that, and remove init_idle() from idle_thread_get(). Secondary startups were patched via coccinelle: @begone@ @@ -preempt_disable(); ... cpu_startup_entry(CPUHP_AP_ONLINE_IDLE); Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210512094636.2958515-1-valentin.schneider@arm.com
* KVM: s390: diag9c (directed yield) forwardingPierre Morel2021-03-091-0/+1
| | | | | | | | | | | | | | | | | | | When we intercept a DIAG_9C from the guest we verify that the target real CPU associated with the virtual CPU designated by the guest is running and if not we forward the DIAG_9C to the target real CPU. To avoid a diag9c storm we allow a maximal rate of diag9c forwarding. The rate is calculated as a count per second defined as a new parameter of the s390 kvm module: diag9c_forwarding_hz . The default value of 0 is to not forward diag9c. Signed-off-by: Pierre Morel <pmorel@linux.ibm.com> Link: https://lore.kernel.org/r/1613997661-22525-2-git-send-email-pmorel@linux.ibm.com Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* s390/smp: implement arch_irq_work_raise()Ilya Leoshkevich2021-02-241-0/+11
| | | | | | | | | | | | | The immediate need to have this is to have bpf_send_signal() send the signal ASAP instead of during the next hrtimer interrupt. However, it should also improve irq_work_queue() latencies in general, as well as get s390 out of the lame architectures list [1]. [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/irq_work.c?h=v5.11#n45 Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390/smp: smp_emergency_stop() - move cpumask away from stackHeiko Carstens2021-02-241-1/+4
| | | | | | | | | | | | | Make "cpumask_t cpumask" a static variable to avoid a potential large stack frame. Also protect against potential concurrent callers by introducing a local lock. Note: smp_emergency_stop() gets only called with irqs and machine checks disabled, therefore a cpu local deadlock is not possible. For concurrent callers the first cpu which enters the critical section wins and will stop all other cpus. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390/smp: __smp_rescan_cpus() - move cpumask away from stackHeiko Carstens2021-02-241-1/+1
| | | | | | | | | Avoid a potentially large stack frame and overflow by making "cpumask_t avail" a static variable. There is no concurrent access due to the existing locking. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390/smp: consolidate locking for smp_rescan()Heiko Carstens2021-02-241-6/+4
| | | | | | | | Move locking to __smp_rescan() instead of duplicating it to all call sites. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390: add stack for machine check handlerSven Schnelle2021-02-131-6/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The previous code used the normal kernel stack for machine checks. This is problematic when a machine check interrupts a system call or interrupt handler right at the beginning where registers are set up. Assume system_call is interrupted at the first instruction and a machine check is triggered. The machine check handler is called, checks the PSW to see whether it is coming from user space, notices that it is already in kernel mode but %r15 still contains the user space stack. This would lead to a kernel crash. There are basically two ways of fixing that: Either using the 'critical cleanup' approach which compares the address in the PSW to see whether it is already at a point where the stack has been set up, or use an extra stack for the machine check handler. For simplicity, we will go with the second approach and allocate an extra stack. This adds some memory overhead for large systems, but usually large system have plenty of memory so this isn't really a concern. But it keeps the mchk stack setup simple and less error prone. Fixes: 0b0ed657fe00 ("s390: remove critical section cleanup from entry.S") Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Cc: <stable@kernel.org> # v5.8+ Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390: convert to generic entrySven Schnelle2021-01-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch converts s390 to use the generic entry infrastructure from kernel/entry/*. There are a few special things on s390: - PIF_PER_TRAP is moved to TIF_PER_TRAP as the generic code doesn't know about our PIF flags in exit_to_user_mode_loop(). - The old code had several ways to restart syscalls: a) PIF_SYSCALL_RESTART, which was only set during execve to force a restart after upgrading a process (usually qemu-kvm) to pgste page table extensions. b) PIF_SYSCALL, which is set by do_signal() to indicate that the current syscall should be restarted. This is changed so that do_signal() now also uses PIF_SYSCALL_RESTART. Continuing to use PIF_SYSCALL doesn't work with the generic code, and changing it to PIF_SYSCALL_RESTART makes PIF_SYSCALL and PIF_SYSCALL_RESTART more unique. - On s390 calling sys_sigreturn or sys_rt_sigreturn is implemented by executing a svc instruction on the process stack which causes a fault. While handling that fault the fault code sets PIF_SYSCALL to hand over processing to the syscall code on exit to usermode. The patch introduces PIF_SYSCALL_RET_SET, which is set if ptrace sets a return value for a syscall. The s390x ptrace ABI uses r2 both for the syscall number and return value, so ptrace cannot set the syscall number + return value at the same time. The flag makes handling that a bit easier. do_syscall() will just skip executing the syscall if PIF_SYSCALL_RET_SET is set. CONFIG_DEBUG_ASCE was removd in favour of the generic CONFIG_DEBUG_ENTRY. CR1/7/13 will be checked both on kernel entry and exit to contain the correct asces. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390/smp: perform initial CPU reset also for SMT siblingsSven Schnelle2020-12-091-15/+3
| | | | | | | | | | | | | Not resetting the SMT siblings might leave them in unpredictable state. One of the observed problems was that the CPU timer wasn't reset and therefore large system time values where accounted during CPU bringup. Cc: <stable@kernel.org> # 4.0 Fixes: 10ad34bc76dfb ("s390: add SMT support") Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390/vdso: reimplement getcpu vdso syscallHeiko Carstens2020-11-231-0/+2
| | | | | | | | Implement the previously removed getcpu vdso syscall by using the TOD programmable field to pass the cpu number to user space. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390/mm: use invalid asce instead of kernel asceHeiko Carstens2020-11-231-1/+1
| | | | | | | | | | | | | | | Create a region 3 page table which contains only invalid entries, and use that via "s390_invalid_asce" instead of the kernel ASCE whenever there is either - no user address space available, e.g. during early startup - as an intermediate ASCE when address spaces are switched This makes sure that user space accesses in such situations are guaranteed to fail. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390/mm: remove set_fs / rework address space handlingHeiko Carstens2020-11-231-9/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove set_fs support from s390. With doing this rework address space handling and simplify it. As a result address spaces are now setup like this: CPU running in | %cr1 ASCE | %cr7 ASCE | %cr13 ASCE ----------------------------|-----------|-----------|----------- user space | user | user | kernel kernel, normal execution | kernel | user | kernel kernel, kvm guest execution | gmap | user | kernel To achieve this the getcpu vdso syscall is removed in order to avoid secondary address mode and a separate vdso address space in for user space. The getcpu vdso syscall will be implemented differently with a subsequent patch. The kernel accesses user space always via secondary address space. This happens in different ways: - with mvcos in home space mode and directly read/write to secondary address space - with mvcs/mvcp in primary space mode and copy from primary space to secondary space or vice versa - with e.g. cs in secondary space mode and access secondary space Switching translation modes happens with sacf before and after instructions which access user space, like before. Lazy handling of control register reloading is removed in the hope to make everything simpler, but at the cost of making kernel entry and exit a bit slower. That is: on kernel entry the primary asce is always changed to contain the kernel asce, and on kernel exit the primary asce is changed again so it contains the user asce. In kernel mode there is only one exception to the primary asce: when kvm guests are executed the primary asce contains the gmap asce (which describes the guest address space). The primary asce is reset to kernel asce whenever kvm guest execution is interrupted, so that this doesn't has to be taken into account for any user space accesses. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390/smp: move rcu_cpu_starting() earlierQian Cai2020-11-031-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The call to rcu_cpu_starting() in smp_init_secondary() is not early enough in the CPU-hotplug onlining process, which results in lockdep splats as follows: WARNING: suspicious RCU usage ----------------------------- kernel/locking/lockdep.c:3497 RCU-list traversed in non-reader section!! other info that might help us debug this: RCU used illegally from offline CPU! rcu_scheduler_active = 1, debug_locks = 1 no locks held by swapper/1/0. Call Trace: show_stack+0x158/0x1f0 dump_stack+0x1f2/0x238 __lock_acquire+0x2640/0x4dd0 lock_acquire+0x3a8/0xd08 _raw_spin_lock_irqsave+0xc0/0xf0 clockevents_register_device+0xa8/0x528 init_cpu_timer+0x33e/0x468 smp_init_secondary+0x11a/0x328 smp_start_secondary+0x82/0x88 This is avoided by moving the call to rcu_cpu_starting up near the beginning of the smp_init_secondary() function. Note that the raw_smp_processor_id() is required in order to avoid calling into lockdep before RCU has declared the CPU to be watched for readers. Link: https://lore.kernel.org/lkml/160223032121.7002.1269740091547117869.tip-bot2@tip-bot2/ Signed-off-by: Qian Cai <cai@redhat.com> Acked-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390/nvme: support firmware-assisted dump to NVMe disksAlexander Egorenkov2020-10-021-6/+6
| | | | | | | | | | | From the kernel perspective NVMe dump works exactly like zFCP dump. Therefore, adapt all places where code explicitly tests only for IPL of type FCP DUMP. And also set the memory end correctly in this case. Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com> Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Reviewed-by: Philipp Rudo <prudo@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* Merge tag 'uninit-macro-v5.9-rc1' of ↵Linus Torvalds2020-08-041-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull uninitialized_var() macro removal from Kees Cook: "This is long overdue, and has hidden too many bugs over the years. The series has several "by hand" fixes, and then a trivial treewide replacement. - Clean up non-trivial uses of uninitialized_var() - Update documentation and checkpatch for uninitialized_var() removal - Treewide removal of uninitialized_var()" * tag 'uninit-macro-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: compiler: Remove uninitialized_var() macro treewide: Remove uninitialized_var() usage checkpatch: Remove awareness of uninitialized_var() macro mm/debug_vm_pgtable: Remove uninitialized_var() usage f2fs: Eliminate usage of uninitialized_var() macro media: sur40: Remove uninitialized_var() usage KVM: PPC: Book3S PR: Remove uninitialized_var() usage clk: spear: Remove uninitialized_var() usage clk: st: Remove uninitialized_var() usage spi: davinci: Remove uninitialized_var() usage ide: Remove uninitialized_var() usage rtlwifi: rtl8192cu: Remove uninitialized_var() usage b43: Remove uninitialized_var() usage drbd: Remove uninitialized_var() usage x86/mm/numa: Remove uninitialized_var() usage docs: deprecated.rst: Add uninitialized_var()
| * treewide: Remove uninitialized_var() usageKees Cook2020-07-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using uninitialized_var() is dangerous as it papers over real bugs[1] (or can in the future), and suppresses unrelated compiler warnings (e.g. "unused variable"). If the compiler thinks it is uninitialized, either simply initialize the variable or make compiler changes. In preparation for removing[2] the[3] macro[4], remove all remaining needless uses with the following script: git grep '\buninitialized_var\b' | cut -d: -f1 | sort -u | \ xargs perl -pi -e \ 's/\buninitialized_var\(([^\)]+)\)/\1/g; s:\s*/\* (GCC be quiet|to make compiler happy) \*/$::g;' drivers/video/fbdev/riva/riva_hw.c was manually tweaked to avoid pathological white-space. No outstanding warnings were found building allmodconfig with GCC 9.3.0 for x86_64, i386, arm64, arm, powerpc, powerpc64le, s390x, mips, sparc64, alpha, and m68k. [1] https://lore.kernel.org/lkml/20200603174714.192027-1-glider@google.com/ [2] https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1TGqCR5vQkCzWJ0QxK6CernOU6eedsudAixw@mail.gmail.com/ [3] https://lore.kernel.org/lkml/CA+55aFwgbgqhbp1fkxvRKEpzyR5J8n1vKT1VZdz9knmPuXhOeg@mail.gmail.com/ [4] https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yVJu65TpLgN_ybYNv0VEOKA@mail.gmail.com/ Reviewed-by: Leon Romanovsky <leonro@mellanox.com> # drivers/infiniband and mlx4/mlx5 Acked-by: Jason Gunthorpe <jgg@mellanox.com> # IB Acked-by: Kalle Valo <kvalo@codeaurora.org> # wireless drivers Reviewed-by: Chao Yu <yuchao0@huawei.com> # erofs Signed-off-by: Kees Cook <keescook@chromium.org>
* | s390/smp: add missing linebreakHeiko Carstens2020-07-031-0/+1
| | | | | | | | Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
* | s390/smp: move smp_cpus_done() to header fileHeiko Carstens2020-07-031-4/+0
|/ | | | | | Saves us a couple of bytes. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
* Merge tag 's390-5.8-1' of ↵Linus Torvalds2020-06-081-0/+8
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Vasily Gorbik: - Add support for multi-function devices in pci code. - Enable PF-VF linking for architectures using the pdev->no_vf_scan flag (currently just s390). - Add reipl from NVMe support. - Get rid of critical section cleanup in entry.S. - Refactor PNSO CHSC (perform network subchannel operation) in cio and qeth. - QDIO interrupts and error handling fixes and improvements, more refactoring changes. - Align ioremap() with generic code. - Accept requests without the prefetch bit set in vfio-ccw. - Enable path handling via two new regions in vfio-ccw. - Other small fixes and improvements all over the code. * tag 's390-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (52 commits) vfio-ccw: make vfio_ccw_regops variables declarations static vfio-ccw: Add trace for CRW event vfio-ccw: Wire up the CRW irq and CRW region vfio-ccw: Introduce a new CRW region vfio-ccw: Refactor IRQ handlers vfio-ccw: Introduce a new schib region vfio-ccw: Refactor the unregister of the async regions vfio-ccw: Register a chp_event callback for vfio-ccw vfio-ccw: Introduce new helper functions to free/destroy regions vfio-ccw: document possible errors vfio-ccw: Enable transparent CCW IPL from DASD s390/pci: Log new handle in clp_disable_fh() s390/cio, s390/qeth: cleanup PNSO CHSC s390/qdio: remove q->first_to_kick s390/qdio: fix up qdio_start_irq() kerneldoc s390: remove critical section cleanup from entry.S s390: add machine check SIGP s390/pci: ioremap() align with generic code s390/ap: introduce new ap function ap_get_qdev() Documentation/s390: Update / remove developerWorks web links ...
| * s390: add machine check SIGPSven Schnelle2020-05-281-0/+8
| | | | | | | | | | | | | | | | | | | | This will be used with the upcoming entry.S changes to signal that there's a machine check pending that cannot be handled in the Machine check handler itself. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* | s390/ftrace: fix potential crashes when switching tracersPhilipp Rudo2020-04-221-2/+2
|/ | | | | | | | | | | | | | Switching tracers include instruction patching. To prevent that a instruction is patched while it's read the instruction patching is done in stop_machine 'context'. This also means that any function called during stop_machine must not be traced. Thus add 'notrace' to all functions called within stop_machine. Fixes: 1ec2772e0c3c ("s390/diag: add a statistic for diagnose calls") Fixes: 38f2c691a4b3 ("s390: improve wait logic of stop_machine") Fixes: 4ecf0a43e729 ("processor: get rid of cpu_relax_yield") Signed-off-by: Philipp Rudo <prudo@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>