summaryrefslogtreecommitdiffstats
path: root/arch/s390/lib/spinlock.c
Commit message (Collapse)AuthorAgeFilesLines
* s390/spinlock: fix indentationHeiko Carstens2017-11-141-3/+4
| | | | | | | | | | | | | checkpatch: WARNING: Statements should start on a tabstop #9499: FILE: arch/s390/lib/spinlock.c:231: + return; sparse: arch/s390/lib/spinlock.c:81 arch_load_niai4() warn: inconsistent indenting Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
* Merge branch 'locking-core-for-linus' of ↵Linus Torvalds2017-11-131-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core locking updates from Ingo Molnar: "The main changes in this cycle are: - Another attempt at enabling cross-release lockdep dependency tracking (automatically part of CONFIG_PROVE_LOCKING=y), this time with better performance and fewer false positives. (Byungchul Park) - Introduce lockdep_assert_irqs_enabled()/disabled() and convert open-coded equivalents to lockdep variants. (Frederic Weisbecker) - Add down_read_killable() and use it in the VFS's iterate_dir() method. (Kirill Tkhai) - Convert remaining uses of ACCESS_ONCE() to READ_ONCE()/WRITE_ONCE(). Most of the conversion was Coccinelle driven. (Mark Rutland, Paul E. McKenney) - Get rid of lockless_dereference(), by strengthening Alpha atomics, strengthening READ_ONCE() with smp_read_barrier_depends() and thus being able to convert users of lockless_dereference() to READ_ONCE(). (Will Deacon) - Various micro-optimizations: - better PV qspinlocks (Waiman Long), - better x86 barriers (Michael S. Tsirkin) - better x86 refcounts (Kees Cook) - ... plus other fixes and enhancements. (Borislav Petkov, Juergen Gross, Miguel Bernal Marin)" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits) locking/x86: Use LOCK ADD for smp_mb() instead of MFENCE rcu: Use lockdep to assert IRQs are disabled/enabled netpoll: Use lockdep to assert IRQs are disabled/enabled timers/posix-cpu-timers: Use lockdep to assert IRQs are disabled/enabled sched/clock, sched/cputime: Use lockdep to assert IRQs are disabled/enabled irq_work: Use lockdep to assert IRQs are disabled/enabled irq/timings: Use lockdep to assert IRQs are disabled/enabled perf/core: Use lockdep to assert IRQs are disabled/enabled x86: Use lockdep to assert IRQs are disabled/enabled smp/core: Use lockdep to assert IRQs are disabled/enabled timers/hrtimer: Use lockdep to assert IRQs are disabled/enabled timers/nohz: Use lockdep to assert IRQs are disabled/enabled workqueue: Use lockdep to assert IRQs are disabled/enabled irq/softirqs: Use lockdep to assert IRQs are disabled/enabled locking/lockdep: Add IRQs disabled/enabled assertion APIs: lockdep_assert_irqs_enabled()/disabled() locking/pvqspinlock: Implement hybrid PV queued/unfair locks locking/rwlocks: Fix comments x86/paravirt: Set up the virt_spin_lock_key after static keys get initialized block, locking/lockdep: Assign a lock_class per gendisk used for wait_for_completion() workqueue: Remove now redundant lock acquisitions wrt. workqueue flushes ...
| * Merge branch 'linus' into locking/core, to resolve conflictsIngo Molnar2017-11-071-0/+1
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: include/linux/compiler-clang.h include/linux/compiler-gcc.h include/linux/compiler-intel.h include/uapi/linux/stddef.h Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns ↵Mark Rutland2017-10-251-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | to READ_ONCE()/WRITE_ONCE() Please do not apply this to mainline directly, instead please re-run the coccinelle script shown below and apply its output. For several reasons, it is desirable to use {READ,WRITE}_ONCE() in preference to ACCESS_ONCE(), and new code is expected to use one of the former. So far, there's been no reason to change most existing uses of ACCESS_ONCE(), as these aren't harmful, and changing them results in churn. However, for some features, the read/write distinction is critical to correct operation. To distinguish these cases, separate read/write accessors must be used. This patch migrates (most) remaining ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following coccinelle script: ---- // Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and // WRITE_ONCE() // $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch virtual patch @ depends on patch @ expression E1, E2; @@ - ACCESS_ONCE(E1) = E2 + WRITE_ONCE(E1, E2) @ depends on patch @ expression E; @@ - ACCESS_ONCE(E) + READ_ONCE(E) ---- Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: davem@davemloft.net Cc: linux-arch@vger.kernel.org Cc: mpe@ellerman.id.au Cc: shuah@kernel.org Cc: snitzer@redhat.com Cc: thor.thayer@linux.intel.com Cc: tj@kernel.org Cc: viro@zeniv.linux.org.uk Cc: will.deacon@arm.com Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | Merge branch 'for-linus' of ↵Linus Torvalds2017-11-131-146/+197
|\ \ \ | |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Heiko Carstens: "Since Martin is on vacation you get the s390 pull request for the v4.15 merge window this time from me. Besides a lot of cleanups and bug fixes these are the most important changes: - a new regset for runtime instrumentation registers - hardware accelerated AES-GCM support for the aes_s390 module - support for the new CEX6S crypto cards - support for FORTIFY_SOURCE - addition of missing z13 and new z14 instructions to the in-kernel disassembler - generate opcode tables for the in-kernel disassembler out of a simple text file instead of having to manually maintain those tables - fast memset16, memset32 and memset64 implementations - removal of named saved segment support - hardware counter support for z14 - queued spinlocks and queued rwlocks implementations for s390 - use the stack_depth tracking feature for s390 BPF JIT - a new s390_sthyi system call which emulates the sthyi (store hypervisor information) instruction - removal of the old KVM virtio transport - an s390 specific CPU alternatives implementation which is used in the new spinlock code" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (88 commits) MAINTAINERS: add virtio-ccw.h to virtio/s390 section s390/noexec: execute kexec datamover without DAT s390: fix transactional execution control register handling s390/bpf: take advantage of stack_depth tracking s390: simplify transactional execution elf hwcap handling s390/zcrypt: Rework struct ap_qact_ap_info. s390/virtio: remove unused header file kvm_virtio.h s390: avoid undefined behaviour s390/disassembler: generate opcode tables from text file s390/disassembler: remove insn_to_mnemonic() s390/dasd: avoid calling do_gettimeofday() s390: vfio-ccw: Do not attempt to free no-op, test and tic cda. s390: remove named saved segment support s390/archrandom: Reconsider s390 arch random implementation s390/pci: do not require AIS facility s390/qdio: sanitize put_indicator s390/qdio: use atomic_cmpxchg s390/nmi: avoid using long-displacement facility s390: pass endianness info to sparse s390/decompressor: remove informational messages ...
| * | s390/spinlock: use cpu alternatives to enable niai instructionVasily Gorbik2017-10-181-6/+3
| | | | | | | | | | | | | | | | | | | | | | | | Enable niai instruction in the spinlock code at run-time for machines on which facility 49 is available (zEC12 and newer). Signed-off-by: Vasily Gorbik <gor@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | s390/rwlock: introduce rwlock wait queueingMartin Schwidefsky2017-09-281-109/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Like the common queued rwlock code the s390 implementation uses the queued spinlock code on a spinlock_t embedded in the rwlock_t to achieve the queueing. The encoding of the rwlock_t differs though, the counter field in the rwlock_t is split into two parts. The upper two bytes hold the write bit and the write wait counter, the lower two bytes hold the read counter. The arch_read_lock operation works exactly like the common qrwlock but the enqueue operation for a writer follows a diffent logic. After the failed inline try to get the rwlock in write, the writer first increases the write wait counter, acquires the wait spin_lock for the queueing, and then loops until there are no readers and the write bit is zero. Without the write wait counter a CPU that just released the rwlock could immediately reacquire the lock in the inline code, bypassing all outstanding read and write waiters. For s390 this would cause massive imbalances in favour of writers in case of a contended rwlock. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | s390/spinlock: introduce spinlock wait queueingMartin Schwidefsky2017-09-281-30/+164
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The queued spinlock code for s390 follows the principles of the common code qspinlock implementation but with a few notable differences. The format of the spinlock_t locking word differs, s390 needs to store the logical CPU number of the lock holder in the spinlock_t to be able to use the diagnose 9c directed yield hypervisor call. The inline code sequences for spin_lock and spin_unlock are nice and short. The inline portion of a spin_lock now typically looks like this: lhi %r0,0 # 0 indicates an empty lock l %r1,0x3a0 # CPU number + 1 from lowcore cs %r0,%r1,<some_lock> # lock operation jnz call_wait # on failure call wait function locked: ... call_wait: la %r2,<some_lock> brasl %r14,arch_spin_lock_wait j locked A spin_unlock is as simple as before: lhi %r0,0 sth %r0,2(%r2) # unlock operation After a CPU has queued itself it may not enable interrupts again for the arch_spin_lock_flags() variant. The arch_spin_lock_wait_flags wait function is removed. To improve performance the code implements opportunistic lock stealing. If the wait function finds a spinlock_t that indicates that the lock is free but there are queued waiters, the CPU may steal the lock up to three times without queueing itself. The lock stealing update the steal counter in the lock word to prevent more than 3 steals. The counter is reset at the time the CPU next in the queue successfully takes the lock. While the queued spinlocks improve performance in a system with dedicated CPUs, in a virtualized environment with continuously overcommitted CPUs the queued spinlocks can have a negative effect on performance. This is due to the fact that a queued CPU that is preempted by the hypervisor will block the queue at some point even without holding the lock. With the classic spinlock it does not matter if a CPU is preempted that waits for the lock. Therefore use the queued spinlock code only if the system runs with dedicated CPUs and fall back to classic spinlocks when running with shared CPUs. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | s390/spinlock: use the cpu number +1 as spinlock valueMartin Schwidefsky2017-09-281-16/+16
| |/ | | | | | | | | | | | | The queued spinlock code will come out simpler if the encoding of the CPU that holds the spinlock is (cpu+1) instead of (~cpu). Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* / License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman2017-11-021-0/+1
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* s390/spinlock: add niai spinlock hintsMartin Schwidefsky2017-07-261-36/+51
| | | | | | | | | | | | The z14 machine introduces new mode of the next-instruction-access-intent NIAI instruction. With NIAI-8 it is possible to pin a cache-line on a CPU for a small amount of time, NIAI-7 releases the cache-line again. Finally NIAI-4 can be used to prevent the CPU to speculatively access memory beyond the compare-and-swap instruction to get the lock. Use these instruction in the spinlock code. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/spinlock: remove compare and delay instructionMartin Schwidefsky2017-04-121-28/+5
| | | | | | | | | | The CAD instruction never worked quite as expected for the spinlock code. It has been disabled by default with git commit 61b0b01686d48220, if the "cad" kernel parameter is specified it is enabled for both user space and the spinlock code. Leave the option to enable the instruction for user space but remove it from the spinlock code. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/spinlock: use atomic primitives for spinlocksMartin Schwidefsky2017-04-121-38/+35
| | | | | | | | Add a couple more __atomic_xxx function to atomic_ops.h and use them to replace the compare-and-swap inlines in the spinlock code. This changes the type of the lock value from unsigned int to int. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390: replace ACCESS_ONCE with READ_ONCEChristian Borntraeger2017-02-171-1/+1
| | | | | | | | Remove the last places of ACCESS_ONCE in s390 code. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390: Audit and remove any remaining unnecessary uses of module.hPaul Gortmaker2017-02-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | Historically a lot of these existed because we did not have a distinction between what was modular code and what was providing support to modules via EXPORT_SYMBOL and friends. That changed when we forked out support for the latter into the export.h file. This means we should be able to reduce the usage of module.h in code that is obj-y Makefile or bool Kconfig. The advantage in doing so is that module.h itself sources about 15 other headers; adding significantly to what we feed cpp, and it can obscure what headers we are effectively using. Since module.h was the source for init.h (for __init) and for export.h (for EXPORT_SYMBOL) we consider each change instance for the presence of either and replace as needed. An instance where module_param was used without moduleparam.h was also fixed, as well as implicit use of ptrace.h and string.h headers. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* locking/spinlocks, s390: Implement vcpu_is_preempted(cpu)Christian Borntraeger2016-11-221-17/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This implements the s390 version for vcpu_is_preempted(cpu), by reworking the existing smp_vcpu_scheduled() function into arch_vcpu_is_preempted(). We can then also get rid of the local cpu_is_preempted() function by moving the CIF_ENABLED_WAIT test into arch_vcpu_is_preempted(). Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: David.Laight@ACULAB.COM Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: benh@kernel.crashing.org Cc: boqun.feng@gmail.com Cc: bsingharora@gmail.com Cc: dave@stgolabs.net Cc: jgross@suse.com Cc: kernellwp@gmail.com Cc: konrad.wilk@oracle.com Cc: linuxppc-dev@lists.ozlabs.org Cc: mpe@ellerman.id.au Cc: paulmck@linux.vnet.ibm.com Cc: paulus@samba.org Cc: pbonzini@redhat.com Cc: rkrcmar@redhat.com Cc: virtualization@lists.linux-foundation.org Cc: will.deacon@arm.com Cc: xen-devel-request@lists.xenproject.org Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/1478077718-37424-6-git-send-email-xinhui.pan@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* s390/spinlock: avoid yield to non existent cpuHeiko Carstens2016-04-151-0/+1
| | | | | | | | | | | | | | | | | | | arch_spin_lock_wait_flags() checks if a spinlock is not held before trying a compare and swap instruction. If the lock is unlocked it tries the compare and swap instruction, however if a different cpu grabbed the lock in the meantime the instruction will fail as expected. Subsequently the arch_spin_lock_wait_flags() incorrectly tries to figure out if the cpu that holds the lock is running. However it is using the wrong cpu number for this (-1) and then will also yield the current cpu to the wrong cpu. Fix this by adding a missing continue statement. Fixes: 470ada6b1a1d ("s390/spinlock: refactor arch_spin_lock_wait[_flags]") Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/spinlock: do not yield to a CPU in udelay/mdelayMartin Schwidefsky2015-11-271-8/+17
| | | | | | | | | | | | | It does not make sense to try to relinquish the time slice with diag 0x9c to a CPU in a state that does not allow to schedule the CPU. The scenario where this can happen is a CPU waiting in udelay/mdelay while holding a spin-lock. Add a CIF bit to tag a CPU in enabled wait and use it to detect that the yield of a CPU will not be successful and skip the diagnose call. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/spinlock: avoid diagnose loopMartin Schwidefsky2015-11-271-9/+19
| | | | | | | | | | | | | | | | | The spinlock implementation calls the diagnose 0x9c / 0x44 immediately if the SIGP sense running reported the target CPU as not running. The diagnose 0x9c is a hint to the hypervisor to schedule the target CPU in preference to the source CPU that issued the diagnose. It can happen that on return from the diagnose the target CPU has not been scheduled yet, e.g. if the target logical CPU is on another physical CPU and the hypervisor did not want to migrate the logical CPU. Avoid the immediate repeat of the diagnose instruction, instead do the retry loop before the next invocation of diagnose 0x9c. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/spinlock: use correct barriersChristian Borntraeger2015-10-141-2/+2
| | | | | | | | | | | | _raw_write_lock_wait first sets the high order bit to indicate a pending writer and then waits for the reader to drop to zero. smp_rmb by definition only orders reads against reads. Let's use a full smp_mb instead. As right now smp_rmb is implemented as full serialization, this needs no stable backport, but this patch will be necessary if we reimplement smp_rmb. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/spinlock: add compare-and-delay to lock wait loopsMartin Schwidefsky2015-01-231-7/+45
| | | | | | | | | | | | Add the compare-and-delay instruction to the spin-lock and rw-lock retry loops. A CPU executing the compare-and-delay instruction stops until the lock value has changed. This is done to make the locking code for contended locks to behave better in regard to the multi- hreading facility. A thread of a core executing a compare-and-delay will allow the other threads of a core to get a larger share of the core resources. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/rwlock: use the interlocked-access facility 1 instructionsMartin Schwidefsky2014-09-251-0/+34
| | | | | | | Make use of the load-and-add, load-and-or and load-and-and instructions to atomically update the read-write lock without a compare-and-swap loop. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/rwlock: improve writer fairnessMartin Schwidefsky2014-09-251-5/+9
| | | | | | | | | Set the write-lock bit in the out-of-line rwlock code to indicate that a writer is waiting. Additional readers will no be able to get the lock until at least one writer got the lock. Additional writers have to wait for the first writer to release the lock again. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/rwlock: remove interrupt-enabling rwlock variant.Martin Schwidefsky2014-09-251-50/+0
| | | | Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/rwlock: use directed yield for write-locked rwlocksMartin Schwidefsky2014-09-251-19/+30
| | | | | | | | | Add an owner field to the arch_rwlock_t to be able to pass the timeslice of a virtual CPU with diagnose 0x9c to the lock owner in case the rwlock is write-locked. The undirected yield in case the rwlock is acquired writable but the lock is read-locked is removed. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/spinlock: refactor arch_spin_lock_wait[_flags]Martin Schwidefsky2014-05-201-34/+47
| | | | | | Reorder the spinlock wait code to make it more readable. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/rwlock: add missing local_irq_restore callsMartin Schwidefsky2014-05-201-0/+2
| | | | | | | | | | The out of line _raw_read_lock_wait_flags/_raw_write_lock_wait_flags functions for the arch_read_lock_flags/arch_write_lock_flags calls fail to re-enable the interrupts after another unsuccessful try to get the lock with compare-and-swap. The following wait would be done with interrupts disabled which is suboptimal. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/spinlock,rwlock: always to a load-and-test firstMartin Schwidefsky2014-05-201-13/+16
| | | | | | | | | | | | | | | | In case a lock is contended it is better to do a load-and-test first before trying to get the lock with compare-and-swap. This helps to avoid unnecessary cache invalidations of the cacheline for the lock if the CPU has to wait for the lock. For an uncontended lock doing the compare-and-swap directly is a bit better, if the CPU does not have the cacheline in its cache yet the compare-and-swap will get it read-write immediately while a load-and-test would get it read-only first. Always to the load-and-test first to avoid the cacheline invalidations for the contended case outweight the potential read-only to read-write cacheline upgrade for the uncontended case. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/spinlock: fix system hang with spin_retry <= 0Gerald Schaefer2014-05-201-6/+8
| | | | | | | | | | On LPAR, when spin_retry is set to <= 0, arch_spin_lock_wait() and arch_spin_lock_wait_flags() may end up in a while(1) loop w/o doing any compare and swap operation. To fix this, use do/while instead of for loop. Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/spinlock: optimize spinlock code sequencePhilipp Hachtmann2014-05-201-2/+2
| | | | | | | | | Use lowcore constant to improve the code generated for spinlocks. [ Martin Schwidefsky: patch breakdown and code beautification ] Signed-off-by: Philipp Hachtmann <phacht@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/spinlock: cleanup spinlock codePhilipp Hachtmann2014-05-201-29/+26
| | | | | | | | | | | | | | Improve the spinlock code in several aspects: - Have _raw_compare_and_swap return true if the operation has been successful instead of returning the old value. - Remove the "volatile" from arch_spinlock_t and arch_rwlock_t - Rename 'owner_cpu' to 'lock' - Add helper functions arch_spin_trylock_once / arch_spin_tryrelease_once [ Martin Schwidefsky: patch breakdown and code beautification ] Signed-off-by: Philipp Hachtmann <phacht@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/comments: unify copyright messages and remove file namesHeiko Carstens2012-07-201-2/+1
| | | | | | | | | | | | | | Remove the file name from the comment at top of many files. In most cases the file name was wrong anyway, so it's rather pointless. Also unify the IBM copyright statement. We did have a lot of sightly different statements and wanted to change them one after another whenever a file gets touched. However that never happened. Instead people start to take the old/"wrong" statements to use as a template for new files. So unify all of them in one go. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
* [S390] rework smp codeMartin Schwidefsky2012-03-111-22/+8
| | | | | | | | | | | | | | | | | | | | | Define struct pcpu and merge some of the NR_CPUS arrays into it, including __cpu_logical_map, current_set and smp_cpu_state. Split smp related functions to those operating on physical cpus and the functions operating on a logical cpu number. Make the functions for physical cpus use a pointer to a struct pcpu. This hides the knowledge about cpu addresses in smp.c, entry[64].S and swsusp_asm64.S, thus remove the sigp.h header. The PSW restart mechanism is used to start secondary cpus, calling a function on an online cpu, calling a function on the ipl cpu, and for the nmi signal. Replace the different assembler functions with a single function restart_int_handler. The new entry point calls a function whose pointer is stored in the lowcore of the target cpu and it can wait for the source cpu to stop. This covers all existing use cases. Overall the code is now simpler and there are ~380 lines less code. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] spinlock: check virtual cpu running statusGerald Schaefer2010-02-261-16/+37
| | | | | | | | | | | This patch introduces a new function that checks the running status of a cpu in a hypervisor. This status is not virtualized, so the check is only correct if running in an LPAR. On acquiring a spinlock, if the cpu holding the lock is scheduled by the hypervisor, we do a busy wait on the lock. If it is not scheduled, we yield over to that cpu. Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] Move __cpu_logical_map to smp.cHeiko Carstens2010-01-131-1/+1
| | | | | | | | Finally move it to the place where it belongs to and make get rid of it for !CONFIG_SMP. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* locking: Convert raw_rwlock functions to arch_rwlockThomas Gleixner2009-12-141-6/+6
| | | | | | | | | | Name space cleanup for rwlock functions. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: linux-arch@vger.kernel.org
* locking: Convert raw_rwlock to arch_rwlockThomas Gleixner2009-12-141-6/+6
| | | | | | | | | | | | | | Not strictly necessary for -rt as -rt does not have non sleeping rwlocks, but it's odd to not have a consistent naming convention. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: linux-arch@vger.kernel.org
* locking: Convert __raw_spin* functions to arch_spin*Thomas Gleixner2009-12-141-11/+11
| | | | | | | | | | Name space cleanup. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: linux-arch@vger.kernel.org
* locking: Convert raw_spinlock to arch_spinlockThomas Gleixner2009-12-141-4/+4
| | | | | | | | | | | | | | | | | | | The raw_spin* namespace was taken by lockdep for the architecture specific implementations. raw_spin_* would be the ideal name space for the spinlocks which are not converted to sleeping locks in preempt-rt. Linus suggested to convert the raw_ to arch_ locks and cleanup the name space instead of using an artifical name like core_spin, atomic_spin or whatever No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: linux-arch@vger.kernel.org
* [S390] implement interrupt-enabling rwlocksHeiko Carstens2009-06-121-0/+40
| | | | | | | | arch backend for f5f7eac41db827a47b2163330eecd7bb55ae9f12 "Allow rwlocks to re-enable interrupts". Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] do local_irq_restore while spinning in spin_lock_irqsave.Hisashi Hifumi2008-01-261-0/+23
| | | | | | | | | | | | | | | | | In s390's spin_lock_irqsave, interrupts remain disabled while spinning. In other architectures like x86 and powerpc, interrupts are re-enabled while spinning if IRQ is not masked before spin_lock_irqsave is called. The following patch re-enables interrupts through local_irq_restore while spinning for a lock acquisition. This can improve system response. [heiko.carstens@de.ibm.com: removed saving of pc] Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] Remove owner_pc member from raw_spinlock_t.Heiko Carstens2008-01-261-8/+4
| | | | | | | | | | Used to contain the address of the holder of the lock. But since the spinlock code is not inlined anymore all locks contain the same address anyway. And since in addtition nobody complained about that for ages its obviously unused. So remove it. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [PATCH] Directed yield: direct yield of spinlocks for s390.Martin Schwidefsky2006-10-011-23/+39
| | | | | | | | | | | Use the new diagnose 0x9c in the spinlock implementation for s390. It yields the remaining timeslice of the virtual cpu that tries to acquire a lock to the virtual cpu that is the current holder of the lock. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] s390: Increase spinlock retry code performanceChristian Ehrhardt2006-03-091-2/+13
| | | | | | | | | | | | | | Currently the code tries up to spin_retry times to grab a lock using the cs instruction. The cs instruction has exclusive access to a memory region and therefore invalidates the appropiate cache line of all other cpus. If there is contention on a lock this leads to cache line trashing. This can be avoided if we first check wether a cs instruction is likely to succeed before the instruction gets actually executed. Signed-off-by: Christian Ehrhardt <ehrhardt@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] s390: spinlock fixesMartin Schwidefsky2006-01-141-7/+0
| | | | | | | | Remove useless spin_retry_counter and fix compilation for UP kernels. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] s390: cleanup KconfigMartin Schwidefsky2006-01-061-1/+1
| | | | | | | | | | Sanitize some s390 Kconfig options. We have ARCH_S390, ARCH_S390X, ARCH_S390_31, 64BIT, S390_SUPPORT and COMPAT. Replace these 6 options by S390, 64BIT and COMPAT. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] spinlock consolidationIngo Molnar2005-09-101-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch (written by me and also containing many suggestions of Arjan van de Ven) does a major cleanup of the spinlock code. It does the following things: - consolidates and enhances the spinlock/rwlock debugging code - simplifies the asm/spinlock.h files - encapsulates the raw spinlock type and moves generic spinlock features (such as ->break_lock) into the generic code. - cleans up the spinlock code hierarchy to get rid of the spaghetti. Most notably there's now only a single variant of the debugging code, located in lib/spinlock_debug.c. (previously we had one SMP debugging variant per architecture, plus a separate generic one for UP builds) Also, i've enhanced the rwlock debugging facility, it will now track write-owners. There is new spinlock-owner/CPU-tracking on SMP builds too. All locks have lockup detection now, which will work for both soft and hard spin/rwlock lockups. The arch-level include files now only contain the minimally necessary subset of the spinlock code - all the rest that can be generalized now lives in the generic headers: include/asm-i386/spinlock_types.h | 16 include/asm-x86_64/spinlock_types.h | 16 I have also split up the various spinlock variants into separate files, making it easier to see which does what. The new layout is: SMP | UP ----------------------------|----------------------------------- asm/spinlock_types_smp.h | linux/spinlock_types_up.h linux/spinlock_types.h | linux/spinlock_types.h asm/spinlock_smp.h | linux/spinlock_up.h linux/spinlock_api_smp.h | linux/spinlock_api_up.h linux/spinlock.h | linux/spinlock.h /* * here's the role of the various spinlock/rwlock related include files: * * on SMP builds: * * asm/spinlock_types.h: contains the raw_spinlock_t/raw_rwlock_t and the * initializers * * linux/spinlock_types.h: * defines the generic type and initializers * * asm/spinlock.h: contains the __raw_spin_*()/etc. lowlevel * implementations, mostly inline assembly code * * (also included on UP-debug builds:) * * linux/spinlock_api_smp.h: * contains the prototypes for the _spin_*() APIs. * * linux/spinlock.h: builds the final spin_*() APIs. * * on UP builds: * * linux/spinlock_type_up.h: * contains the generic, simplified UP spinlock type. * (which is an empty structure on non-debug builds) * * linux/spinlock_types.h: * defines the generic type and initializers * * linux/spinlock_up.h: * contains the __raw_spin_*()/etc. version of UP * builds. (which are NOPs on non-debug, non-preempt * builds) * * (included on UP-non-debug builds:) * * linux/spinlock_api_up.h: * builds the _spin_*() APIs. * * linux/spinlock.h: builds the final spin_*() APIs. */ All SMP and UP architectures are converted by this patch. arm, i386, ia64, ppc, ppc64, s390/s390x, x64 was build-tested via crosscompilers. m32r, mips, sh, sparc, have not been tested yet, but should be mostly fine. From: Grant Grundler <grundler@parisc-linux.org> Booted and lightly tested on a500-44 (64-bit, SMP kernel, dual CPU). Builds 32-bit SMP kernel (not booted or tested). I did not try to build non-SMP kernels. That should be trivial to fix up later if necessary. I converted bit ops atomic_hash lock to raw_spinlock_t. Doing so avoids some ugly nesting of linux/*.h and asm/*.h files. Those particular locks are well tested and contained entirely inside arch specific code. I do NOT expect any new issues to arise with them. If someone does ever need to use debug/metrics with them, then they will need to unravel this hairball between spinlocks, atomic ops, and bit ops that exist only because parisc has exactly one atomic instruction: LDCW (load and clear word). From: "Luck, Tony" <tony.luck@intel.com> ia64 fix Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjanv@infradead.org> Signed-off-by: Grant Grundler <grundler@parisc-linux.org> Cc: Matthew Wilcox <willy@debian.org> Signed-off-by: Hirokazu Takata <takata@linux-m32r.org> Signed-off-by: Mikael Pettersson <mikpe@csd.uu.se> Signed-off-by: Benoit Boissinot <benoit.boissinot@ens-lyon.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] s390: spin lock retryMartin Schwidefsky2005-07-271-0/+133
Split spin lock and r/w lock implementation into a single try which is done inline and an out of line function that repeatedly tries to get the lock before doing the cpu_relax(). Add a system control to set the number of retries before a cpu is yielded. The reason for the spin lock retry is that the diagnose 0x44 that is used to give up the virtual cpu is quite expensive. For spin locks that are held only for a short period of time the costs of the diagnoses outweights the savings for spin locks that are held for a longer timer. The default retry count is 1000. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>