summaryrefslogtreecommitdiffstats
path: root/arch/s390/kernel
Commit message (Collapse)AuthorAgeFilesLines
* s390/module: fix loading modules with a lot of relocationsIlya Leoshkevich2022-01-241-19/+18
| | | | | | | | | | | | | | | | | If the size of the PLT entries generated by apply_rela() exceeds 64KiB, the first ones can no longer reach __jump_r1 with brc. Fix by using brcl. An alternative solution is to add a __jump_r1 copy after every 64KiB, however, the space savings are quite small and do not justify the additional complexity. Fixes: f19fbd5ed642 ("s390: introduce execute-trampolines for branches") Cc: stable@vger.kernel.org Reported-by: Andrea Righi <andrea.righi@canonical.com> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390/nmi: handle vector validity failures for KVM guestsChristian Borntraeger2022-01-231-1/+8
| | | | | | | | | | | | | The machine check validity bit tells about the context. If a KVM guest was running the bit tells about the guest validity and the host state is not affected. As a guest can disable the guest validity this might result in unwanted host errors on machine checks. Cc: stable@vger.kernel.org Fixes: c929500d7a5a ("s390/nmi: s390: New low level handling for machine check happening in guest") Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390/nmi: handle guarded storage validity failures for KVM guestsChristian Borntraeger2022-01-231-4/+14
| | | | | | | | | | | | | | | | | | machine check validity bits reflect the state of the machine check. If a guest does not make use of guarded storage, the validity bit might be off. We can not use the host CR bit to decide if the validity bit must be on. So ignore "invalid" guarded storage controls for KVM guests in the host and rely on the machine check being forwarded to the guest. If no other errors happen from a host perspective everything is fine and no process must be killed and the host can continue to run. Cc: stable@vger.kernel.org Fixes: c929500d7a5a ("s390/nmi: s390: New low level handling for machine check happening in guest") Reported-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com> Tested-by: Carsten Otte <cotte@de.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* Merge tag 's390-5.17-2' of ↵Linus Torvalds2022-01-213-6/+6
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Heiko Carstens: - add Sven Schnelle as reviewer for s390 code - make uaccess code more readable - change cpu measurement facility code to also support counter second version number 7, and add discard support for limited samples * tag 's390-5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390: add Sven Schnelle as reviewer s390/uaccess: introduce bit field for OAC specifier s390/cpumf: Support for CPU Measurement Sampling Facility LS bit s390/cpumf: Support for CPU Measurement Facility CSVN 7
| * s390/cpumf: Support for CPU Measurement Sampling Facility LS bitThomas Richter2022-01-171-1/+1
| | | | | | | | | | | | | | | | | | | | Adds support for the CPU Measurement Sampling Facility limit sampling bit in the sampling device driver. Limited samples have no valueable information are not collected. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
| * s390/cpumf: Support for CPU Measurement Facility CSVN 7Thomas Richter2022-01-172-5/+5
| | | | | | | | | | | | | | | | | | Adds support for the CPU Measurement Counter Facility second version number 7. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* | Merge branch 'signal-for-v5.17' of ↵Linus Torvalds2022-01-172-2/+2
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull signal/exit/ptrace updates from Eric Biederman: "This set of changes deletes some dead code, makes a lot of cleanups which hopefully make the code easier to follow, and fixes bugs found along the way. The end-game which I have not yet reached yet is for fatal signals that generate coredumps to be short-circuit deliverable from complete_signal, for force_siginfo_to_task not to require changing userspace configured signal delivery state, and for the ptrace stops to always happen in locations where we can guarantee on all architectures that the all of the registers are saved and available on the stack. Removal of profile_task_ext, profile_munmap, and profile_handoff_task are the big successes for dead code removal this round. A bunch of small bug fixes are included, as most of the issues reported were small enough that they would not affect bisection so I simply added the fixes and did not fold the fixes into the changes they were fixing. There was a bug that broke coredumps piped to systemd-coredump. I dropped the change that caused that bug and replaced it entirely with something much more restrained. Unfortunately that required some rebasing. Some successes after this set of changes: There are few enough calls to do_exit to audit in a reasonable amount of time. The lifetime of struct kthread now matches the lifetime of struct task, and the pointer to struct kthread is no longer stored in set_child_tid. The flag SIGNAL_GROUP_COREDUMP is removed. The field group_exit_task is removed. Issues where task->exit_code was examined with signal->group_exit_code should been examined were fixed. There are several loosely related changes included because I am cleaning up and if I don't include them they will probably get lost. The original postings of these changes can be found at: https://lkml.kernel.org/r/87a6ha4zsd.fsf@email.froward.int.ebiederm.org https://lkml.kernel.org/r/87bl1kunjj.fsf@email.froward.int.ebiederm.org https://lkml.kernel.org/r/87r19opkx1.fsf_-_@email.froward.int.ebiederm.org I trimmed back the last set of changes to only the obviously correct once. Simply because there was less time for review than I had hoped" * 'signal-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (44 commits) ptrace/m68k: Stop open coding ptrace_report_syscall ptrace: Remove unused regs argument from ptrace_report_syscall ptrace: Remove second setting of PT_SEIZED in ptrace_attach taskstats: Cleanup the use of task->exit_code exit: Use the correct exit_code in /proc/<pid>/stat exit: Fix the exit_code for wait_task_zombie exit: Coredumps reach do_group_exit exit: Remove profile_handoff_task exit: Remove profile_task_exit & profile_munmap signal: clean up kernel-doc comments signal: Remove the helper signal_group_exit signal: Rename group_exit_task group_exec_task coredump: Stop setting signal->group_exit_task signal: Remove SIGNAL_GROUP_COREDUMP signal: During coredumps set SIGNAL_GROUP_EXIT in zap_process signal: Make coredump handling explicit in complete_signal signal: Have prepare_signal detect coredumps using signal->core_state signal: Have the oom killer detect coredumps using signal->core_state exit: Move force_uaccess back into do_exit exit: Guarantee make_task_dead leaks the tsk when calling do_task_exit ...
| * | exit: Add and use make_task_dead.Eric W. Biederman2021-12-132-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are two big uses of do_exit. The first is it's design use to be the guts of the exit(2) system call. The second use is to terminate a task after something catastrophic has happened like a NULL pointer in kernel code. Add a function make_task_dead that is initialy exactly the same as do_exit to cover the cases where do_exit is called to handle catastrophic failure. In time this can probably be reduced to just a light wrapper around do_task_dead. For now keep it exactly the same so that there will be no behavioral differences introducing this new concept. Replace all of the uses of do_exit that use it for catastraphic task cleanup with make_task_dead to make it clear what the code is doing. As part of this rename rewind_stack_do_exit rewind_stack_and_make_dead. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
| * | exit/s390: Remove dead reference to do_exit from copy_threadEric W. Biederman2021-12-131-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | My s390 assembly is not particularly good so I have read the history of the reference to do_exit copy_thread and have been able to verify that do_exit is not used. The general argument is that s390 has been changed to use the generic kernel_thread and kernel_execve and the generic versions do not call do_exit. So it is strange to see a do_exit reference sitting there. The history of the do_exit reference in s390's version of copy_thread seems conclusive that the do_exit reference is something that lingers and should have been removed several years ago. Up through 8d19f15a60be ("[PATCH] s390 update (1/27): arch.") the s390 code made a call to the exit(2) system call when a kernel thread finished. Then kernel_thread_starter was added which branched directly to the value in register 11 when the kernel thread finshed. The value in register 11 was set in kernel_thread to "regs.gprs[11] = (unsigned long) do_exit" In commit 37fe5d41f640 ("s390: fold kernel_thread_helper() into ret_from_fork()") kernel_thread_starter was moved into entry.S and entry64.S unchanged (except for the syntax differences between inline assemly and in the assembly file). In commit f9a7e025dfc2 ("s390: switch to generic kernel_thread()") the assignment to "gprs[11]" was moved into copy_thread from the old kernel_thread. The helper kernel_thread_starter was still being used and was still branching to "%r11" at the end. In commit 30dcb0996e40 ("s390: switch to saner kernel_execve() semantics") kernel_thread_starter was changed to unconditionally branch to sysc_tracenogo instead to %r11 which held the value of do_exit. Unfortunately copy_thread was not updated to stop passing do_exit in "gprs[11]". In commit 56e62a737028 ("s390: convert to generic entry") kernel_thread_starter was replaced by __ret_from_fork. And the code still continued to pass do_exit in "gprs[11]" despite __ret_from_fork not caring in the slightest. Remove this dead reference to do_exit to make it clear that s390 is not doing anything with do_exit in copy_thread. Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Fixes: 30dcb0996e40 ("s390: switch to saner kernel_execve() semantics") History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* | | Merge branch 'akpm' (patches from Andrew)Linus Torvalds2022-01-152-2/+4
|\ \ \ | |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge misc updates from Andrew Morton: "146 patches. Subsystems affected by this patch series: kthread, ia64, scripts, ntfs, squashfs, ocfs2, vfs, and mm (slab-generic, slab, kmemleak, dax, kasan, debug, pagecache, gup, shmem, frontswap, memremap, memcg, selftests, pagemap, dma, vmalloc, memory-failure, hugetlb, userfaultfd, vmscan, mempolicy, oom-kill, hugetlbfs, migration, thp, ksm, page-poison, percpu, rmap, zswap, zram, cleanups, hmm, and damon)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (146 commits) mm/damon: hide kernel pointer from tracepoint event mm/damon/vaddr: hide kernel pointer from damon_va_three_regions() failure log mm/damon/vaddr: use pr_debug() for damon_va_three_regions() failure logging mm/damon/dbgfs: remove an unnecessary variable mm/damon: move the implementation of damon_insert_region to damon.h mm/damon: add access checking for hugetlb pages Docs/admin-guide/mm/damon/usage: update for schemes statistics mm/damon/dbgfs: support all DAMOS stats Docs/admin-guide/mm/damon/reclaim: document statistics parameters mm/damon/reclaim: provide reclamation statistics mm/damon/schemes: account how many times quota limit has exceeded mm/damon/schemes: account scheme actions that successfully applied mm/damon: remove a mistakenly added comment for a future feature Docs/admin-guide/mm/damon/usage: update for kdamond_pid and (mk|rm)_contexts Docs/admin-guide/mm/damon/usage: mention tracepoint at the beginning Docs/admin-guide/mm/damon/usage: remove redundant information Docs/admin-guide/mm/damon/usage: update for scheme quotas and watermarks mm/damon: convert macro functions to static inline functions mm/damon: modify damon_rand() macro to static inline function mm/damon: move damon_rand() definition into damon.h ...
| * | mm/mempolicy: wire up syscall set_mempolicy_home_nodeAneesh Kumar K.V2022-01-151-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Link: https://lkml.kernel.org/r/20211202123810.267175-4-aneesh.kumar@linux.ibm.com Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Ben Widawsky <ben.widawsky@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Feng Tang <feng.tang@intel.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: <linux-api@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | mm: defer kmemleak object creation of module_alloc()Kefeng Wang2022-01-151-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Yongqiang reports a kmemleak panic when module insmod/rmmod with KASAN enabled(without KASAN_VMALLOC) on x86[1]. When the module area allocates memory, it's kmemleak_object is created successfully, but the KASAN shadow memory of module allocation is not ready, so when kmemleak scan the module's pointer, it will panic due to no shadow memory with KASAN check. module_alloc __vmalloc_node_range kmemleak_vmalloc kmemleak_scan update_checksum kasan_module_alloc kmemleak_ignore Note, there is no problem if KASAN_VMALLOC enabled, the modules area entire shadow memory is preallocated. Thus, the bug only exits on ARCH which supports dynamic allocation of module area per module load, for now, only x86/arm64/s390 are involved. Add a VM_DEFER_KMEMLEAK flags, defer vmalloc'ed object register of kmemleak in module_alloc() to fix this issue. [1] https://lore.kernel.org/all/6d41e2b9-4692-5ec4-b1cd-cbe29ae89739@huawei.com/ [wangkefeng.wang@huawei.com: fix build] Link: https://lkml.kernel.org/r/20211125080307.27225-1-wangkefeng.wang@huawei.com [akpm@linux-foundation.org: simplify ifdefs, per Andrey] Link: https://lkml.kernel.org/r/CA+fCnZcnwJHUQq34VuRxpdoY6_XbJCDJ-jopksS5Eia4PijPzw@mail.gmail.com Link: https://lkml.kernel.org/r/20211124142034.192078-1-wangkefeng.wang@huawei.com Fixes: 793213a82de4 ("s390/kasan: dynamic shadow mem allocation for modules") Fixes: 39d114ddc682 ("arm64: add KASAN support") Fixes: bebf56a1b176 ("kasan: enable instrumentation of global variables") Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reported-by: Yongqiang Liu <liuyongqiang13@huawei.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Alexander Potapenko <glider@google.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge tag 's390-5.17-1' of ↵Linus Torvalds2022-01-108-52/+44
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Heiko Carstens: "Besides all the small improvements and cleanups the most notable part is the fast vector/SIMD implementation of the ChaCha20 stream cipher, which is an adaptation of Andy Polyakov's code for the kernel. Summary: - add fast vector/SIMD implementation of the ChaCha20 stream cipher, which mainly adapts Andy Polyakov's code for the kernel - add status attribute to AP queue device so users can easily figure out its status - fix race in page table release code, and and lots of documentation - remove uevent suppress from cio device driver, since it turned out that it generated more problems than it solved problems - quite a lot of virtual vs physical address confusion fixes - various other small improvements and cleanups all over the place" * tag 's390-5.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (39 commits) s390/dasd: use default_groups in kobj_type s390/sclp_sd: use default_groups in kobj_type s390/pci: simplify __pciwb_mio() inline asm s390: remove unused TASK_SIZE_OF s390/crash_dump: fix virtual vs physical address handling s390/crypto: fix compile error for ChaCha20 module s390/mm: check 2KB-fragment page on release s390/mm: better annotate 2KB pagetable fragments handling s390/mm: fix 2KB pgtable release race s390/sclp: release SCLP early buffer after kernel initialization s390/nmi: disable interrupts on extended save area update s390/zcrypt: CCA control CPRB sending s390/disassembler: update opcode table s390/uv: fix memblock virtual vs physical address confusion s390/smp: fix memblock_phys_free() vs memblock_free() confusion s390/sclp: fix memblock_phys_free() vs memblock_free() confusion s390/exit: remove dead reference to do_exit from copy_thread s390/ap: add missing virt_to_phys address conversion s390/pgalloc: use pointers instead of unsigned long values s390/pgalloc: add virt/phys address handling to base asce functions ...
| * | | s390/crash_dump: fix virtual vs physical address handlingHeiko Carstens2021-12-202-12/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signal processor STORE STATUS requires a physical address where register contents are supposed to be written to, however the kernel must read the data via the corresponding virtual address. Also the allocated save_area, where register contents are copied to, resides in virtual address space. Fix this by using proper __pa() conversion, or correct memblock_alloc() invocation. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
| * | | s390/nmi: disable interrupts on extended save area updateAlexander Gordeev2021-12-163-30/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Updating of the pointer to machine check extended save area on the IPL CPU needs the lowcore protection to be disabled. Disable interrupts while the protection is off to avoid unnoticed writes to the lowcore. Suggested-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
| * | | s390/disassembler: update opcode tableHeiko Carstens2021-12-161-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sync with binutils: update opcode table to reflect the instruction format update of the lpswey instruction, and add the qpaci instruction. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
| * | | s390/uv: fix memblock virtual vs physical address confusionHeiko Carstens2021-12-161-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | memblock_alloc_try_nid() returns a virtual address, however in error case the allocated memory is incorrectly freed with memblock_phys_free(). Properly use memblock_free() instead, and pass a physical address to uv_init() to fix this. Note: this doesn't fix a bug currently, since virtual and physical addresses are identical. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
| * | | s390/smp: fix memblock_phys_free() vs memblock_free() confusionHeiko Carstens2021-12-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | memblock_phys_free() is used on a virtual address. Fix this by using memblock_free(). Note: this doesn't fix a bug currently, since virtual and physical addresses are identical. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
| * | | s390/exit: remove dead reference to do_exit from copy_threadEric W. Biederman2021-12-161-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | My s390 assembly is not particularly good so I have read the history of the reference to do_exit copy_thread and have been able to verify that do_exit is not used. The general argument is that s390 has been changed to use the generic kernel_thread and kernel_execve and the generic versions do not call do_exit. So it is strange to see a do_exit reference sitting there. The history of the do_exit reference in s390's version of copy_thread seems conclusive that the do_exit reference is something that lingers and should have been removed several years ago. Up through 8d19f15a60be ("[PATCH] s390 update (1/27): arch.") the s390 code made a call to the exit(2) system call when a kernel thread finished. Then kernel_thread_starter was added which branched directly to the value in register 11 when the kernel thread finshed. The value in register 11 was set in kernel_thread to "regs.gprs[11] = (unsigned long) do_exit" In commit 37fe5d41f640 ("s390: fold kernel_thread_helper() into ret_from_fork()") kernel_thread_starter was moved into entry.S and entry64.S unchanged (except for the syntax differences between inline assemly and in the assembly file). In commit f9a7e025dfc2 ("s390: switch to generic kernel_thread()") the assignment to "gprs[11]" was moved into copy_thread from the old kernel_thread. The helper kernel_thread_starter was still being used and was still branching to "%r11" at the end. In commit 30dcb0996e40 ("s390: switch to saner kernel_execve() semantics") kernel_thread_starter was changed to unconditionally branch to sysc_tracenogo instead to %r11 which held the value of do_exit. Unfortunately copy_thread was not updated to stop passing do_exit in "gprs[11]". In commit 56e62a737028 ("s390: convert to generic entry") kernel_thread_starter was replaced by __ret_from_fork. And the code still continued to pass do_exit in "gprs[11]" despite __ret_from_fork not caring in the slightest. Remove this dead reference to do_exit to make it clear that s390 is not doing anything with do_exit in copy_thread. History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Fixes: 30dcb0996e40 ("s390: switch to saner kernel_execve() semantics") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Link: https://lore.kernel.org/r/20211208202532.16409-1-ebiederm@xmission.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
| * | | s390/nmi: add missing __pa/__va address conversion of extended save areaHeiko Carstens2021-12-063-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add missing __pa/__va address conversion of machine check extended save area designation, which is an absolute address. Note: this currently doesn't fix a real bug, since virtual addresses are indentical to physical ones. Reported-by: Vineeth Vijayan <vneethv@linux.ibm.com> Tested-by: Vineeth Vijayan <vneethv@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* | | | Merge tag 'arm64-upstream' of ↵Linus Torvalds2022-01-101-2/+1
|\ \ \ \ | |_|/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: - KCSAN enabled for arm64. - Additional kselftests to exercise the syscall ABI w.r.t. SVE/FPSIMD. - Some more SVE clean-ups and refactoring in preparation for SME support (scalable matrix extensions). - BTI clean-ups (SYM_FUNC macros etc.) - arm64 atomics clean-up and codegen improvements. - HWCAPs for FEAT_AFP (alternate floating point behaviour) and FEAT_RPRESS (increased precision of reciprocal estimate and reciprocal square root estimate). - Use SHA3 instructions to speed-up XOR. - arm64 unwind code refactoring/unification. - Avoid DC (data cache maintenance) instructions when DCZID_EL0.DZP == 1 (potentially set by a hypervisor; user-space already does this). - Perf updates for arm64: support for CI-700, HiSilicon PCIe PMU, Marvell CN10K LLC-TAD PMU, miscellaneous clean-ups. - Other fixes and clean-ups; highlights: fix the handling of erratum 1418040, correct the calculation of the nomap region boundaries, introduce io_stop_wc() mapped to the new DGH instruction (data gathering hint). * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (81 commits) arm64: Use correct method to calculate nomap region boundaries arm64: Drop outdated links in comments arm64: perf: Don't register user access sysctl handler multiple times drivers: perf: marvell_cn10k: fix an IS_ERR() vs NULL check perf/smmuv3: Fix unused variable warning when CONFIG_OF=n arm64: errata: Fix exec handling in erratum 1418040 workaround arm64: Unhash early pointer print plus improve comment asm-generic: introduce io_stop_wc() and add implementation for ARM64 arm64: Ensure that the 'bti' macro is defined where linkage.h is included arm64: remove __dma_*_area() aliases docs/arm64: delete a space from tagged-address-abi arm64: Enable KCSAN kselftest/arm64: Add pidbench for floating point syscall cases arm64/fp: Add comments documenting the usage of state restore functions kselftest/arm64: Add a test program to exercise the syscall ABI kselftest/arm64: Allow signal tests to trigger from a function kselftest/arm64: Parameterise ptrace vector length information arm64/sve: Minor clarification of ABI documentation arm64/sve: Generalise vector length configuration prctl() for SME arm64/sve: Make sysctl interface for SVE reusable by SME ...
| * | | arch: Make ARCH_STACKWALK independent of STACKTRACEPeter Zijlstra2021-12-101-2/+1
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make arch_stack_walk() available for ARCH_STACKWALK architectures without it being entangled in STACKTRACE. Link: https://lore.kernel.org/lkml/20211022152104.356586621@infradead.org/ Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> [Mark: rebase, drop unnecessary arm change] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/20211129142849.3056714-2-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
* | | s390/entry: fix duplicate tracking of irq nesting levelSven Schnelle2021-12-121-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the current code, when exiting from idle, rcu_irq_enter() is called twice during irq entry: irq_entry_enter()-> rcu_irq_enter() irq_enter() -> rcu_irq_enter() This may lead to wrong results from rcu_is_cpu_rrupt_from_idle() because of a wrong dynticks nmi nesting count. Fix this by only calling irq_enter_rcu(). Cc: <stable@vger.kernel.org> # 5.12+ Reported-by: Mark Rutland <mark.rutland@arm.com> Fixes: 56e62a737028 ("s390: convert to generic entry") Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* | | s390/kexec: handle R_390_PLT32DBL rela in arch_kexec_apply_relocations_add()Alexander Egorenkov2021-12-101-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Starting with gcc 11.3, the C compiler will generate PLT-relative function calls even if they are local and do not require it. Later on during linking, the linker will replace all PLT-relative calls to local functions with PC-relative ones. Unfortunately, the purgatory code of kexec/kdump is not being linked as a regular executable or shared library would have been, and therefore, all PLT-relative addresses remain in the generated purgatory object code unresolved. This leads to the situation where the purgatory code is being executed during kdump with all PLT-relative addresses unresolved. And this results in endless loops within the purgatory code. Furthermore, the clang C compiler has always behaved like described above and this commit should fix kdump for kernels built with the latter. Because the purgatory code is no regular executable or shared library, contains only calls to local functions and has no PLT, all R_390_PLT32DBL relocation entries can be resolved just like a R_390_PC32DBL one. * https://refspecs.linuxfoundation.org/ELF/zSeries/lzsabi0_zSeries/x1633.html#AEN1699 Relocation entries of purgatory code generated with gcc 11.3 ------------------------------------------------------------ $ readelf -r linux/arch/s390/purgatory/purgatory.o Relocation section '.rela.text' at offset 0x370 contains 5 entries: Offset Info Type Sym. Value Sym. Name + Addend 00000000005c 000c00000013 R_390_PC32DBL 0000000000000000 purgatory_sha_regions + 2 00000000007a 000d00000014 R_390_PLT32DBL 0000000000000000 sha256_update + 2 00000000008c 000e00000014 R_390_PLT32DBL 0000000000000000 sha256_final + 2 000000000092 000800000013 R_390_PC32DBL 0000000000000000 .LC0 + 2 0000000000a0 000f00000014 R_390_PLT32DBL 0000000000000000 memcmp + 2 Relocation entries of purgatory code generated with gcc 11.2 ------------------------------------------------------------ $ readelf -r linux/arch/s390/purgatory/purgatory.o Relocation section '.rela.text' at offset 0x368 contains 5 entries: Offset Info Type Sym. Value Sym. Name + Addend 00000000005c 000c00000013 R_390_PC32DBL 0000000000000000 purgatory_sha_regions + 2 00000000007a 000d00000013 R_390_PC32DBL 0000000000000000 sha256_update + 2 00000000008c 000e00000013 R_390_PC32DBL 0000000000000000 sha256_final + 2 000000000092 000800000013 R_390_PC32DBL 0000000000000000 .LC0 + 2 0000000000a0 000f00000013 R_390_PC32DBL 0000000000000000 memcmp + 2 Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com> Reported-by: Tao Liu <ltao@redhat.com> Suggested-by: Philipp Rudo <prudo@redhat.com> Reviewed-by: Philipp Rudo <prudo@redhat.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20211209073817.82196-1-egorenar@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* | | s390/ftrace: remove preempt_disable()/preempt_enable() pairJerome Marchand2021-12-101-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It looks like commit ce5e48036c9e76a2 ("ftrace: disable preemption when recursion locked") missed a spot in kprobe_ftrace_handler() in arch/s390/kernel/ftrace.c. Remove the superfluous preempt_disable/enable_notrace() there too. Fixes: ce5e48036c9e76a2 ("ftrace: disable preemption when recursion locked") Signed-off-by: Jerome Marchand <jmarchan@redhat.com> Link: https://lore.kernel.org/r/20211208151503.1510381-1-jmarchan@redhat.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* | | s390/kexec_file: fix error handling when applying relocationsPhilipp Rudo2021-12-101-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | arch_kexec_apply_relocations_add currently ignores all errors returned by arch_kexec_do_relocs. This means that every unknown relocation is silently skipped causing unpredictable behavior while the relocated code runs. Fix this by checking for errors and fail kexec_file_load if an unknown relocation type is encountered. The problem was found after gcc changed its behavior and used R_390_PLT32DBL relocations for brasl instruction and relied on ld to resolve the relocations in the final link in case direct calls are possible. As the purgatory code is only linked partially (option -r) ld didn't resolve the relocations leaving them for arch_kexec_do_relocs. But arch_kexec_do_relocs doesn't know how to handle R_390_PLT32DBL relocations so they were silently skipped. This ultimately caused an endless loop in the purgatory as the brasl instructions kept branching to itself. Fixes: 71406883fd35 ("s390/kexec_file: Add kexec_file_load system call") Reported-by: Tao Liu <ltao@redhat.com> Signed-off-by: Philipp Rudo <prudo@redhat.com> Link: https://lore.kernel.org/r/20211208130741.5821-3-prudo@redhat.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* | | s390/kexec_file: print some more error messagesPhilipp Rudo2021-12-101-3/+24
|/ / | | | | | | | | | | | | | | Be kind and give some more information on what went wrong. Signed-off-by: Philipp Rudo <prudo@redhat.com> Link: https://lore.kernel.org/r/20211208130741.5821-2-prudo@redhat.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* | Merge tag 's390-5.16-3' of ↵Linus Torvalds2021-11-207-25/+32
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Heiko Carstens: - Add missing Kconfig option for ftrace direct multi sample, so it can be compiled again, and also add s390 support for this sample. - Update Christian Borntraeger's email address. - Various fixes for memory layout setup. Besides other this makes it possible to load shared DCSS segments again. - Fix copy to user space of swapped kdump oldmem. - Remove -mstack-guard and -mstack-size compile options when building vdso binaries. This can happen when CONFIG_VMAP_STACK is disabled and results in broken vdso code which causes more or less random exceptions. Also remove the not needed -nostdlib option. - Fix memory leak on cpu hotplug and return code handling in kexec code. - Wire up futex_waitv system call. - Replace snprintf with sysfs_emit where appropriate. * tag 's390-5.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: ftrace/samples: add s390 support for ftrace direct multi sample ftrace/samples: add missing Kconfig option for ftrace direct multi sample MAINTAINERS: update email address of Christian Borntraeger s390/kexec: fix memory leak of ipl report buffer s390/kexec: fix return code handling s390/dump: fix copying to user-space of swapped kdump oldmem s390: wire up sys_futex_waitv system call s390/vdso: filter out -mstack-guard and -mstack-size s390/vdso: remove -nostdlib compiler flag s390: replace snprintf in show functions with sysfs_emit s390/boot: simplify and fix kernel memory layout setup s390/setup: re-arrange memblock setup s390/setup: avoid using memblock_enforce_memory_limit s390/setup: avoid reserving memory above identity mapping
| * s390/kexec: fix memory leak of ipl report bufferBaoquan He2021-11-181-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | unreferenced object 0x38000195000 (size 4096): comm "kexec", pid 8548, jiffies 4294953647 (age 32443.270s) hex dump (first 32 bytes): 00 00 00 c8 20 00 00 00 00 00 00 c0 02 80 00 00 .... ........... 40 40 40 40 40 40 40 40 00 00 00 00 00 00 00 00 @@@@@@@@........ backtrace: [<0000000011a2f199>] __vmalloc_node_range+0xc0/0x140 [<0000000081fa2752>] vzalloc+0x5a/0x70 [<0000000063a4c92d>] ipl_report_finish+0x2c/0x180 [<00000000553304da>] kexec_file_add_ipl_report+0xf4/0x150 [<00000000862d033f>] kexec_file_add_components+0x124/0x160 [<000000000d2717bb>] arch_kexec_kernel_image_load+0x62/0x90 [<000000002e0373b6>] kimage_file_alloc_init+0x1aa/0x2e0 [<0000000060f2d14f>] __do_sys_kexec_file_load+0x17c/0x2c0 [<000000008c86fe5a>] __s390x_sys_kexec_file_load+0x40/0x50 [<000000001fdb9dac>] __do_syscall+0x1bc/0x1f0 [<000000003ee4258d>] system_call+0x78/0xa0 Signed-off-by: Baoquan He <bhe@redhat.com> Reviewed-by: Philipp Rudo <prudo@redhat.com> Fixes: 99feaa717e55 ("s390/kexec_file: Create ipl report and pass to next kernel") Cc: <stable@vger.kernel.org> # v5.2: 20c76e242e70: s390/kexec: fix return code handling Cc: <stable@vger.kernel.org> # v5.2 Link: https://lore.kernel.org/r/20211116033101.GD21646@MiWiFi-R3L-srv Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
| * s390/kexec: fix return code handlingHeiko Carstens2021-11-182-2/+9
| | | | | | | | | | | | | | | | | | | | kexec_file_add_ipl_report ignores that ipl_report_finish may fail and can return an error pointer instead of a valid pointer. Fix this and simplify by returning NULL in case of an error and let the only caller handle this case. Fixes: 99feaa717e55 ("s390/kexec_file: Create ipl report and pass to next kernel") Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
| * s390/dump: fix copying to user-space of swapped kdump oldmemAlexander Egorenkov2021-11-181-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit fixes a bug introduced by commit e9e7870f90e3 ("s390/dump: introduce boot data 'oldmem_data'"). OLDMEM_BASE was mistakenly replaced by oldmem_data.size instead of oldmem_data.start. This bug caused the following error during kdump: kdump.sh[878]: No program header covering vaddr 0x3434f5245found kexec bug? Fixes: e9e7870f90e3 ("s390/dump: introduce boot data 'oldmem_data'") Cc: stable@vger.kernel.org # 5.15+ Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com> Reviewed-by: Marc Hartmayer <mhartmay@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
| * s390: wire up sys_futex_waitv system callVasily Gorbik2021-11-161-0/+1
| | | | | | | | | | | | | | Tested with futex kselftests. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
| * s390/vdso: filter out -mstack-guard and -mstack-sizeSven Schnelle2021-11-161-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When CONFIG_VMAP_STACK is disabled, the user can enable CONFIG_STACK_CHECK, which adds a stack overflow check to each C function in the kernel. This is also done for functions in the vdso page. These functions are run in user context and user stack sizes are usually different to what the kernel uses. This might trigger the stack check although the stack size is valid. Therefore filter the -mstack-guard and -mstack-size flags when compiling vdso C files. Cc: stable@kernel.org # 5.10+ Fixes: 4bff8cb54502 ("s390: convert to GENERIC_VDSO") Reported-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
| * s390/vdso: remove -nostdlib compiler flagMasahiro Yamada2021-11-162-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | The -nostdlib option requests the compiler to not use the standard system startup files or libraries when linking. It is effective only when $(CC) is used as a linker driver. Since commit 2b2a25845d53 ("s390/vdso: Use $(LD) instead of $(CC) to link vDSO"), $(LD) is directly used, hence -nostdlib is unneeded. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20211107162111.323701-1-masahiroy@kernel.org Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
| * s390/setup: re-arrange memblock setupVasily Gorbik2021-11-161-5/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Avoid using ULONG_MAX in memblock_remove, it has no functional change but makes memblock_dbg output a range which makes sense. - Actually finish memblock memory setup before doing amode31/cr/uv setup. - Move memblock_dump_all() debug output after memblock memory setup is complete. This gives us final "memory" regions if they were trimmed due to addressing limits and still "physmem" regions as original info which came from mem_detect. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
| * s390/setup: avoid using memblock_enforce_memory_limitVasily Gorbik2021-11-161-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a difference in how architectures treat "mem=" option. For some that is an amount of online memory, for s390 and x86 this is the limiting max address. Some memblock api like memblock_enforce_memory_limit() take limit argument and explicitly treat it as the size of online memory, and use __find_max_addr to convert it to an actual max address. Current s390 usage: memblock_enforce_memory_limit(memblock_end_of_DRAM()); yields different results depending on presence of memory holes (offline memory blocks in between online memory). If there are no memory holes limit == max_addr in memblock_enforce_memory_limit() and it does trim online memory and reserved memory regions. With memory holes present it actually does nothing. Since we already use memblock_remove() explicitly to trim online memory regions to potential limit (think mem=, kdump, addressing limits, etc.) drop the usage of memblock_enforce_memory_limit() altogether. Trimming reserved regions should not be required, since we now use memblock_set_current_limit() to limit allocations and any explicit memory reservations above the limit is an actual problem we should not hide. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
| * s390/setup: avoid reserving memory above identity mappingVasily Gorbik2021-11-161-9/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Such reserved memory region, if not cleaned up later causes problems when memblock_free_all() is called to release free pages to the buddy allocator and those reserved regions are carried over to reserve_bootmem_region() which marks the pages as PageReserved. Instead use memblock_set_current_limit() to make sure memblock allocations do not go over identity mapping (which could happen when "mem=" option is used or during kdump). Cc: stable@vger.kernel.org Fixes: 73045a08cf55 ("s390: unify identity mapping limits handling") Reported-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* | signal: Replace force_fatal_sig with force_exit_sig when in doubtEric W. Biederman2021-11-191-1/+1
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recently to prevent issues with SECCOMP_RET_KILL and similar signals being changed before they are delivered SA_IMMUTABLE was added. Unfortunately this broke debuggers[1][2] which reasonably expect to be able to trap synchronous SIGTRAP and SIGSEGV even when the target process is not configured to handle those signals. Add force_exit_sig and use it instead of force_fatal_sig where historically the code has directly called do_exit. This has the implementation benefits of going through the signal exit path (including generating core dumps) without the danger of allowing userspace to ignore or change these signals. This avoids userspace regressions as older kernels exited with do_exit which debuggers also can not intercept. In the future is should be possible to improve the quality of implementation of the kernel by changing some of these force_exit_sig calls to force_fatal_sig. That can be done where it matters on a case-by-case basis with careful analysis. Reported-by: Kyle Huey <me@kylehuey.com> Reported-by: kernel test robot <oliver.sang@intel.com> [1] https://lkml.kernel.org/r/CAP045AoMY4xf8aC_4QU_-j7obuEPYgTcnQQP3Yxk=2X90jtpjw@mail.gmail.com [2] https://lkml.kernel.org/r/20211117150258.GB5403@xsang-OptiPlex-9020 Fixes: 00b06da29cf9 ("signal: Add SA_IMMUTABLE to ensure forced siganls do not get changed") Fixes: a3616a3c0272 ("signal/m68k: Use force_sigsegv(SIGSEGV) in fpsp040_die") Fixes: 83a1f27ad773 ("signal/powerpc: On swapcontext failure force SIGSEGV") Fixes: 9bc508cf0791 ("signal/s390: Use force_sigsegv in default_trap_handler") Fixes: 086ec444f866 ("signal/sparc32: In setup_rt_frame and setup_fram use force_fatal_sig") Fixes: c317d306d550 ("signal/sparc32: Exit with a fatal signal when try_to_clear_window_buffer fails") Fixes: 695dd0d634df ("signal/x86: In emulate_vsyscall force a signal instead of calling do_exit") Fixes: 1fbd60df8a85 ("signal/vm86_32: Properly send SIGSEGV when the vm86 state cannot be saved.") Fixes: 941edc5bf174 ("exit/syscall_user_dispatch: Send ordinary signals on failure") Link: https://lkml.kernel.org/r/871r3dqfv8.fsf_-_@email.froward.int.ebiederm.org Reviewed-by: Kees Cook <keescook@chromium.org> Tested-by: Kees Cook <keescook@chromium.org> Tested-by: Kyle Huey <khuey@kylehuey.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* Merge tag 's390-5.16-2' of ↵Linus Torvalds2021-11-131-1/+3
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Vasily Gorbik: - Add PCI automatic error recovery. - Fix tape driver timer initialization broken during timers api cleanup. - Fix bogus CPU measurement counters values on CPUs offlining. - Check the validity of subchanel before reading other fields in the schib in cio code. * tag 's390-5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/cio: check the subchannel validity for dev_busid s390/cpumf: cpum_cf PMU displays invalid value after hotplug remove s390/tape: fix timer initialization in tape_std_assign() s390/pci: implement minimal PCI error recovery PCI: Export pci_dev_lock() s390/pci: implement reset_slot for hotplug slot s390/pci: refresh function handle in iomap
| * s390/cpumf: cpum_cf PMU displays invalid value after hotplug removeThomas Richter2021-11-081-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a CPU is hotplugged while the perf stat -e cycles command is running, a wrong (very large) value is displayed immediately after the CPU removal: Check the values, shouldn't be too high as in time counts unit events 1.001101919 29261846 cycles 2.002454499 17523405 cycles 3.003659292 24361161 cycles 4.004816983 18446744073638406144 cycles 5.005671647 <not counted> cycles ... The CPU hotplug off took place after 3 seconds. The issue is the read of the event count value after 4 seconds when the CPU is not available and the read of the counter returns an error. This is treated as a counter value of zero. This results in a very large value (0 - previous_value). Fix this by detecting the hotplugged off CPU and report 0 instead of a very large number. Cc: stable@vger.kernel.org Fixes: a029a4eab39e ("s390/cpumf: Allow concurrent access for CPU Measurement Counter Facility") Reported-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* | Merge branch 'exit-cleanups-for-v5.16' of ↵Linus Torvalds2021-11-102-2/+2
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull exit cleanups from Eric Biederman: "While looking at some issues related to the exit path in the kernel I found several instances where the code is not using the existing abstractions properly. This set of changes introduces force_fatal_sig a way of sending a signal and not allowing it to be caught, and corrects the misuse of the existing abstractions that I found. A lot of the misuse of the existing abstractions are silly things such as doing something after calling a no return function, rolling BUG by hand, doing more work than necessary to terminate a kernel thread, or calling do_exit(SIGKILL) instead of calling force_sig(SIGKILL). In the review a deficiency in force_fatal_sig and force_sig_seccomp where ptrace or sigaction could prevent the delivery of the signal was found. I have added a change that adds SA_IMMUTABLE to change that makes it impossible to interrupt the delivery of those signals, and allows backporting to fix force_sig_seccomp And Arnd found an issue where a function passed to kthread_run had the wrong prototype, and after my cleanup was failing to build." * 'exit-cleanups-for-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (23 commits) soc: ti: fix wkup_m3_rproc_boot_thread return type signal: Add SA_IMMUTABLE to ensure forced siganls do not get changed signal: Replace force_sigsegv(SIGSEGV) with force_fatal_sig(SIGSEGV) exit/r8188eu: Replace the macro thread_exit with a simple return 0 exit/rtl8712: Replace the macro thread_exit with a simple return 0 exit/rtl8723bs: Replace the macro thread_exit with a simple return 0 signal/x86: In emulate_vsyscall force a signal instead of calling do_exit signal/sparc32: In setup_rt_frame and setup_fram use force_fatal_sig signal/sparc32: Exit with a fatal signal when try_to_clear_window_buffer fails exit/syscall_user_dispatch: Send ordinary signals on failure signal: Implement force_fatal_sig exit/kthread: Have kernel threads return instead of calling do_exit signal/s390: Use force_sigsegv in default_trap_handler signal/vm86_32: Properly send SIGSEGV when the vm86 state cannot be saved. signal/vm86_32: Replace open coded BUG_ON with an actual BUG_ON signal/sparc: In setup_tsb_params convert open coded BUG into BUG signal/powerpc: On swapcontext failure force SIGSEGV signal/sh: Use force_sig(SIGKILL) instead of do_group_exit(SIGKILL) signal/mips: Update (_save|_restore)_fp_context to fail with -EFAULT signal/sparc32: Remove unreachable do_exit in do_sparc_fault ...
| * signal: Replace force_sigsegv(SIGSEGV) with force_fatal_sig(SIGSEGV)Eric W. Biederman2021-10-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | Now that force_fatal_sig exists it is unnecessary and a bit confusing to use force_sigsegv in cases where the simpler force_fatal_sig is wanted. So change every instance we can to make the code clearer. Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Link: https://lkml.kernel.org/r/877de7jrev.fsf@disp2133 Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
| * signal/s390: Use force_sigsegv in default_trap_handlerEric W. Biederman2021-10-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reading the history it is unclear why default_trap_handler calls do_exit. It is not even menthioned in the commit where the change happened. My best guess is that because it is unknown why the exception happened it was desired to guarantee the process never returned to userspace. Using do_exit(SIGSEGV) has the problem that it will only terminate one thread of a process, leaving the process in an undefined state. Use force_sigsegv(SIGSEGV) instead which effectively has the same behavior except that is uses the ordinary signal mechanism and terminates all threads of a process and is generally well defined. Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: linux-s390@vger.kernel.org Fixes: ca2ab03237ec ("[PATCH] s390: core changes") History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Link: https://lkml.kernel.org/r/20211020174406.17889-11-ebiederm@xmission.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
| * exit: Remove calls of do_exit after noreturn versions of dieEric W. Biederman2021-10-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On nds32, openrisc, s390, sh, and xtensa the function die never returns. Mark die __noreturn so that no one expects die to return. Remove the do_exit calls after die as they will never be reached. Cc: Jonas Bonn <jonas@southpole.se> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Stafford Horne <shorne@gmail.com> Cc: openrisc@lists.librecores.org Cc: Nick Hu <nickhu@andestech.com> Cc: Greentime Hu <green.hu@gmail.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: linux-s390@vger.kernel.org Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: linux-sh@vger.kernel.org Cc: linux-xtensa@linux-xtensa.org Cc: Chris Zankel <chris@zankel.net> Cc: Max Filippov <jcmvbkbc@gmail.com> Fixes: 2.3.16 Fixes: 2.3.99-pre8 Fixes: 3f65ce4d141e ("[PATCH] xtensa: Architecture support for Tensilica Xtensa Part 5") Fixes: 664eec400bf8 ("nds32: MMU fault handling and page table management") Fixes: 61e85e367535 ("OpenRISC: Memory management") Link: https://lkml.kernel.org/r/20211020174406.17889-2-ebiederm@xmission.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
* | Merge tag 's390-5.16-1' of ↵Linus Torvalds2021-11-0622-275/+403
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Vasily Gorbik: - Add support for ftrace with direct call and ftrace direct call samples. - Add support for kernel command lines longer than current 896 bytes and make its length configurable. - Add support for BEAR enhancement facility to improve last breaking event instruction tracking. - Add kprobes sanity checks and testcases to prevent kprobe in the mid of an instruction. - Allow concurrent access to /dev/hwc for the CPUMF users. - Various ftrace / jump label improvements. - Convert unwinder tests to KUnit. - Add s390_iommu_aperture kernel parameter to tweak the limits on concurrently usable DMA mappings. - Add ap.useirq AP module option which can be used to disable interrupt use. - Add add_disk() error handling support to block device drivers. - Drop arch specific and use generic implementation of strlcpy and strrchr. - Several __pa/__va usages fixes. - Various cio, crypto, pci, kernel doc and other small fixes and improvements all over the code. [ Merge fixup as per https://lore.kernel.org/all/YXAqZ%2FEszRisunQw@osiris/ ] * tag 's390-5.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (63 commits) s390: make command line configurable s390: support command lines longer than 896 bytes s390/kexec_file: move kernel image size check s390/pci: add s390_iommu_aperture kernel parameter s390/spinlock: remove incorrect kernel doc indicator s390/string: use generic strlcpy s390/string: use generic strrchr s390/ap: function rework based on compiler warning s390/cio: make ccw_device_dma_* more robust s390/vfio-ap: s390/crypto: fix all kernel-doc warnings s390/hmcdrv: fix kernel doc comments s390/ap: new module option ap.useirq s390/cpumf: Allow multiple processes to access /dev/hwc s390/bitops: return true/false (not 1/0) from bool functions s390: add support for BEAR enhancement facility s390: introduce nospec_uses_trampoline() s390: rename last_break to pgm_last_break s390/ptrace: add last_break member to pt_regs s390/sclp: sort out physical vs virtual pointers usage s390/setup: convert start and end initrd pointers to virtual ...
| * | s390: support command lines longer than 896 bytesSven Schnelle2021-10-263-5/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently s390 supports a fixed maximum command line length of 896 bytes. This isn't enough as some installers are trying to pass all configuration data via kernel command line, and even with zfcp alone it is easy to generate really long command lines. Therefore extend the command line to 4 kbytes. In the parm area where the command line is stored there is no indication of the maximum allowed length, so a new field which contains the maximum length is added. The parm area has always been initialized to zero, so with old kernels this field would read zero. This is important because tools like zipl could read this field. If it contains a number larger than zero zipl knows the maximum length that can be stored in the parm area, otherwise it must assume that it is booting a legacy kernel and only 896 bytes are available. The removing of trailing whitespace in head.S is also removed because code to do this is already present in setup_boot_command_line(). Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
| * | s390/kexec_file: move kernel image size checkSven Schnelle2021-10-261-15/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation of adding support for command lines with variable sizes on s390, the check whether the new kernel image is at least HEAD_END bytes long isn't correct. Move the check to kexec_file_add_components() so we can get the size of the parm area and check the size there. The '.org HEAD_END' directive can now also be removed from head.S. This was used in the past to reserve space for the early sccb buffer, but with commit 9a5131b87cac1 ("s390/boot: move sclp early buffer from fixed address in asm to C") this is no longer required. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
| * | s390/cpumf: Allow multiple processes to access /dev/hwcThomas Richter2021-10-261-78/+150
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit a029a4eab39e ("s390/cpumf: Allow concurrent access for CPU Measurement Counter Facility") added CPU Measurement counter facility access to multiple consumers. It allows concurrent access to the CPU Measurement counter facility via several perf_event_open() system call invocations and via ioctl() system call of device /dev/hwc. However the access via device /dev/hwc was exclusive, only one process was able to open this device. The patch removes this restriction. Now multiple invocations of lshwc can execute in parallel. They can access different CPUs and counter sets or CPUs and counter set can overlap. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Suggested-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
| * | s390: add support for BEAR enhancement facilitySven Schnelle2021-10-267-16/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Breaking-Event-Address-Register (BEAR) stores the address of the last breaking event instruction. Breaking events are usually instructions that change the program flow - for example branches, and instructions that modify the address in the PSW like lpswe. This is useful for debugging wild branches, because one could easily figure out where the wild branch was originating from. What is problematic is that lpswe is considered a breaking event, and therefore overwrites BEAR on kernel exit. The BEAR enhancement facility adds new instructions that allow to save/restore BEAR and also an lpswey instruction that doesn't cause a breaking event. So we can save BEAR on kernel entry and restore it on exit to user space. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
| * | s390: introduce nospec_uses_trampoline()Sven Schnelle2021-10-262-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | and replace all of the "__is_defined(CC_USING_EXPOLINE) && !nospec_disable" occurrences. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>