| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
parameters
There is a single npu context per set of callback parameters. Callers
should be prevented from overwriting existing callback values so
instead return an error if different parameters are passed.
Fixes: 1ab66d1fbada ("powerpc/powernv: Introduce address translation services for Nvlink2")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Reviewed-by: Mark Hairgrove <mhairgrove@nvidia.com>
Tested-by: Mark Hairgrove <mhairgrove@nvidia.com>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Fix crashes when loading modules built with a different
CONFIG_RELOCATABLE value by adding CONFIG_RELOCATABLE to vermagic.
- Fix busy loops in the OPAL NVRAM driver if we get certain error
conditions from firmware.
- Remove tlbie trace points from KVM code that's called in real mode,
because it causes crashes.
- Fix checkstops caused by invalid tlbiel on Power9 Radix.
- Ensure the set of CPU features we "know" are always enabled is
actually the minimal set when we build with support for firmware
supplied CPU features.
Thanks to: Aneesh Kumar K.V, Anshuman Khandual, Nicholas Piggin.
* tag 'powerpc-4.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64s: Fix CPU_FTRS_ALWAYS vs DT CPU features
powerpc/mm/radix: Fix checkstops caused by invalid tlbiel
KVM: PPC: Book3S HV: trace_tlbie must not be called in realmode
powerpc/8xx: Fix build with hugetlbfs enabled
powerpc/powernv: Fix OPAL NVRAM driver OPAL_BUSY loops
powerpc/powernv: define a standard delay for OPAL_BUSY type retry loops
powerpc/fscr: Enable interrupts earlier before calling get_user()
powerpc/64s: Fix section mismatch warnings from setup_rfi_flush()
powerpc/modules: Fix crashes by adding CONFIG_RELOCATABLE to vermagic
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The cpu_has_feature() mechanism has an optimisation where at build
time we construct a mask of the CPU feature bits that will always be
true for the given .config, based on the platform/bitness/etc. that we
are building for.
That is incompatible with DT CPU features, where the set of CPU
features is dependent on feature flags that are given to us by
firmware.
The result is that some feature bits can not be *disabled* by DT CPU
features. Or more accurately, they can be disabled but they will still
appear in the ALWAYS mask, meaning cpu_has_feature() will always
return true for them.
In the past this hasn't really been a problem because on Book3S
64 (where we support DT CPU features), the set of ALWAYS bits has been
very small. That was because we always built for POWER4 and later,
meaning the set of common bits was small.
The only bit that could be cleared by DT CPU features that was also in
the ALWAYS mask was CPU_FTR_NODSISRALIGN, and that was only used in
the alignment handler to create a fake DSISR. That code was itself
deleted in 31bfdb036f12 ("powerpc: Use instruction emulation
infrastructure to handle alignment faults") (Sep 2017).
However the set of ALWAYS features changed with the recent commit
db5ae1c155af ("powerpc/64s: Refine feature sets for little endian
builds") which restricted the set of feature flags when building
little endian to Power7 or later. That caused the ALWAYS mask to
become much larger for little endian builds.
The result is that the following feature bits can currently not
be *disabled* by DT CPU features:
CPU_FTR_REAL_LE, CPU_FTR_MMCRA, CPU_FTR_CTRL, CPU_FTR_SMT,
CPU_FTR_PURR, CPU_FTR_SPURR, CPU_FTR_DSCR, CPU_FTR_PKEY,
CPU_FTR_VMX_COPY, CPU_FTR_CFAR, CPU_FTR_HAS_PPR.
To fix it we need to mask the set of ALWAYS features with the base set
of DT CPU features, ie. the features that are always enabled by DT CPU
features. That way there are no bits in the ALWAYS mask that are not
also always set by DT CPU features.
Fixes: db5ae1c155af ("powerpc/64s: Refine feature sets for little endian builds")
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This is the start of an effort to tidy up and standardise all the
delays. Existing loops have a range of delay/sleep periods from 1ms
to 20ms, and some have no delay. They all loop forever except rtc,
which times out after 10 retries, and that uses 10ms delays. So use
10ms as our standard delay. The OPAL maintainer agrees 10ms is a
reasonable starting point.
The idea is to use the same recipe everywhere, once this is proven to
work then it will be documented as an OPAL API standard. Then both
firmware and OS can agree, and if a particular call needs something
else, then that can be documented with reasoning.
This is not the end-all of this effort, it's just a relatively easy
change that fixes some existing high latency delays. There should be
provision for standardising timeouts and/or interruptible loops where
possible, so non-fatal firmware errors don't cause hangs.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
If you build the kernel with CONFIG_RELOCATABLE=n, then install the
modules, rebuild the kernel with CONFIG_RELOCATABLE=y and leave the
old modules installed, we crash something like:
Unable to handle kernel paging request for data at address 0xd000000018d66cef
Faulting instruction address: 0xc0000000021ddd08
Oops: Kernel access of bad area, sig: 11 [#1]
Modules linked in: x_tables autofs4
CPU: 2 PID: 1 Comm: systemd Not tainted 4.16.0-rc6-gcc_ubuntu_le-g99fec39 #1
...
NIP check_version.isra.22+0x118/0x170
Call Trace:
__ksymtab_xt_unregister_table+0x58/0xfffffffffffffcb8 [x_tables] (unreliable)
resolve_symbol+0xb4/0x150
load_module+0x10e8/0x29a0
SyS_finit_module+0x110/0x140
system_call+0x58/0x6c
This happens because since commit 71810db27c1c ("modversions: treat
symbol CRCs as 32 bit quantities"), a relocatable kernel encodes and
handles symbol CRCs differently from a non-relocatable kernel.
Although it's possible we could try and detect this situation and
handle it, it's much more robust to simply make the state of
CONFIG_RELOCATABLE part of the module vermagic.
Fixes: 71810db27c1c ("modversions: treat symbol CRCs as 32 bit quantities")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
As arch_kexec_kernel_image_{probe,load}(),
arch_kimage_file_post_load_cleanup() and arch_kexec_kernel_verify_sig()
are almost duplicated among architectures, they can be commonalized with
an architecture-defined kexec_file_ops array. So let's factor them out.
Link: http://lkml.kernel.org/r/20180306102303.9063-3-takahiro.akashi@linaro.org
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Acked-by: Dave Young <dyoung@redhat.com>
Tested-by: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|\ \
| |/
|/|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Pull kvm updates from Paolo Bonzini:
"ARM:
- VHE optimizations
- EL2 address space randomization
- speculative execution mitigations ("variant 3a", aka execution past
invalid privilege register access)
- bugfixes and cleanups
PPC:
- improvements for the radix page fault handler for HV KVM on POWER9
s390:
- more kvm stat counters
- virtio gpu plumbing
- documentation
- facilities improvements
x86:
- support for VMware magic I/O port and pseudo-PMCs
- AMD pause loop exiting
- support for AMD core performance extensions
- support for synchronous register access
- expose nVMX capabilities to userspace
- support for Hyper-V signaling via eventfd
- use Enlightened VMCS when running on Hyper-V
- allow userspace to disable MWAIT/HLT/PAUSE vmexits
- usual roundup of optimizations and nested virtualization bugfixes
Generic:
- API selftest infrastructure (though the only tests are for x86 as
of now)"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (174 commits)
kvm: x86: fix a prototype warning
kvm: selftests: add sync_regs_test
kvm: selftests: add API testing infrastructure
kvm: x86: fix a compile warning
KVM: X86: Add Force Emulation Prefix for "emulate the next instruction"
KVM: X86: Introduce handle_ud()
KVM: vmx: unify adjacent #ifdefs
x86: kvm: hide the unused 'cpu' variable
KVM: VMX: remove bogus WARN_ON in handle_ept_misconfig
Revert "KVM: X86: Fix SMRAM accessing even if VM is shutdown"
kvm: Add emulation for movups/movupd
KVM: VMX: raise internal error for exception during invalid protected mode state
KVM: nVMX: Optimization: Dont set KVM_REQ_EVENT when VMExit with nested_run_pending
KVM: nVMX: Require immediate-exit when event reinjected to L2 and L1 event pending
KVM: x86: Fix misleading comments on handling pending exceptions
KVM: x86: Rename interrupt.pending to interrupt.injected
KVM: VMX: No need to clear pending NMI/interrupt on inject realmode interrupt
x86/kvm: use Enlightened VMCS when running on Hyper-V
x86/hyper-v: detect nested features
x86/hyper-v: define struct hv_enlightened_vmcs and clean field bits
...
|
| |\
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
KVM PPC update for 4.17
- Improvements for the radix page fault handler for HV KVM on POWER9.
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Since commit fb1522e099f0 ("KVM: update to new mmu_notifier semantic
v2", 2017-08-31), the MMU notifier code in KVM no longer calls the
kvm_unmap_hva callback. This removes the PPC implementations of
kvm_unmap_hva().
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
| |/
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This patch introduces kvm_para_has_hint() to query for hints about
the configuration of the guests. The first hint KVM_HINTS_DEDICATED,
is set if the guest has dedicated physical CPUs for each vCPU (i.e.
pinning and no over-commitment). This allows optimizing spinlocks
and tells the guest to avoid PV TLB flush.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
"Notable changes:
- Support for 4PB user address space on 64-bit, opt-in via mmap().
- Removal of POWER4 support, which was accidentally broken in 2016
and no one noticed, and blocked use of some modern instructions.
- Workarounds so that the hypervisor can enable Transactional Memory
on Power9.
- A series to disable the DAWR (Data Address Watchpoint Register) on
Power9.
- More information displayed in the meltdown/spectre_v1/v2 sysfs
files.
- A vpermxor (Power8 Altivec) implementation for the raid6 Q
Syndrome.
- A big series to make the allocation of our pacas (per cpu area),
kernel page tables, and per-cpu stacks NUMA aware when using the
Radix MMU on Power9.
And as usual many fixes, reworks and cleanups.
Thanks to: Aaro Koskinen, Alexandre Belloni, Alexey Kardashevskiy,
Alistair Popple, Andy Shevchenko, Aneesh Kumar K.V, Anshuman Khandual,
Balbir Singh, Benjamin Herrenschmidt, Christophe Leroy, Christophe
Lombard, Cyril Bur, Daniel Axtens, Dave Young, Finn Thain, Frederic
Barrat, Gustavo Romero, Horia Geantă, Jonathan Neuschäfer, Kees Cook,
Larry Finger, Laurent Dufour, Laurent Vivier, Logan Gunthorpe,
Madhavan Srinivasan, Mark Greer, Mark Hairgrove, Markus Elfring,
Mathieu Malaterre, Matt Brown, Matt Evans, Mauricio Faria de Oliveira,
Michael Neuling, Naveen N. Rao, Nicholas Piggin, Paul Mackerras,
Philippe Bergheaud, Ram Pai, Rob Herring, Sam Bobroff, Segher
Boessenkool, Simon Guo, Simon Horman, Stewart Smith, Sukadev
Bhattiprolu, Suraj Jitindar Singh, Thiago Jung Bauermann, Vaibhav
Jain, Vaidyanathan Srinivasan, Vasant Hegde, Wei Yongjun"
* tag 'powerpc-4.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (207 commits)
powerpc/64s/idle: Fix restore of AMOR on POWER9 after deep sleep
powerpc/64s: Fix POWER9 DD2.2 and above in cputable features
powerpc/64s: Fix pkey support in dt_cpu_ftrs, add CPU_FTR_PKEY bit
powerpc/64s: Fix dt_cpu_ftrs to have restore_cpu clear unwanted LPCR bits
Revert "powerpc/64s/idle: POWER9 ESL=0 stop avoid save/restore overhead"
powerpc: iomap.c: introduce io{read|write}64_{lo_hi|hi_lo}
powerpc: io.h: move iomap.h include so that it can use readq/writeq defs
cxl: Fix possible deadlock when processing page faults from cxllib
powerpc/hw_breakpoint: Only disable hw breakpoint if cpu supports it
powerpc/mm/radix: Update command line parsing for disable_radix
powerpc/mm/radix: Parse disable_radix commandline correctly.
powerpc/mm/hugetlb: initialize the pagetable cache correctly for hugetlb
powerpc/mm/radix: Update pte fragment count from 16 to 256 on radix
powerpc/mm/keys: Update documentation and remove unnecessary check
powerpc/64s/idle: POWER9 ESL=0 stop avoid save/restore overhead
powerpc/64s/idle: Consolidate power9_offline_stop()/power9_idle_stop()
powerpc/powernv: Always stop secondaries before reboot/shutdown
powerpc: hard disable irqs in smp_send_stop loop
powerpc: use NMI IPI for smp_send_stop
powerpc/powernv: Fix SMT4 forcing idle code
...
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The CPU_FTR_POWER9_DD2_1 flag is intended to be set for DD2.1 and
above (which is what the dt_cpu_ftrs setup does). Fix cputable for
DD2.2 to match.
This came about due to patches b5af4f279323 ("powerpc: Add CPU feature
bits for TM bug workarounds on POWER9 v2.2"), and 9e9626ed3a4a
("powerpc/64s: Fix POWER9 DD2.2 and above in DT CPU features") being
in-flight at once. The latter patch fixed dt_cpu_ftrs like this one
does. The former changed cputable to match dt_cpu_ftrs.
Fixes: b5af4f279323 ("powerpc: Add CPU feature bits for TM bug workarounds on POWER9 v2.2")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Subsequent patches in this series makes use of the readq and writeq
defines in iomap.h. However, as is, they get missed on the powerpc
platform seeing the include comes before the define. This patch moves
the include down to fix this.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
We get the below warning if we try to use kexec on P9:
kexec_core: Starting new kernel
WARNING: CPU: 0 PID: 1223 at arch/powerpc/kernel/process.c:826 __set_breakpoint+0xb4/0x140
[snip]
NIP __set_breakpoint+0xb4/0x140
LR kexec_prepare_cpus_wait+0x58/0x150
Call Trace:
0xc0000000ee70fb20 (unreliable)
0xc0000000ee70fb20
default_machine_kexec+0x234/0x2c0
machine_kexec+0x84/0x90
kernel_kexec+0xd8/0xe0
SyS_reboot+0x214/0x2c0
system_call+0x58/0x6c
This happens since we are trying to clear hw breakpoint on POWER9,
though we don't have CPU_FTR_DAWR enabled. Guard __set_breakpoint()
within hw_breakpoint_disable() with ppc_breakpoint_available() to
address this.
Fixes: 9654153158d3 ("powerpc: Disable DAWR in the base POWER9 CPU features")
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
With split PTL (page table lock) config, we allocate the level
4 (leaf) page table using pte fragment framework instead of slab cache
like other levels. This was done to enable us to have split page table
lock at the level 4 of the page table. We use page->plt backing the
all the level 4 pte fragment for the lock.
Currently with Radix, we use only 16 fragments out of the allocated
page. In radix each fragment is 256 bytes which means we use only 4k
out of the allocated 64K page wasting 60k of the allocated memory.
This was done earlier to keep it closer to hash.
This patch update the pte fragment count to 256, thereby using the
full 64K page and reducing the memory usage. Performance tests shows
really low impact even with THP disabled. With THP disabled we will be
contenting further less on level 4 ptl and hence the impact should be
further low.
256 threads:
without patch (10 runs of ./ebizzy -m -n 1000 -s 131072 -S 100)
median = 15678.5
stdev = 42.1209
with patch:
median = 15354
stdev = 194.743
This is with THP disabled. With THP enabled the impact of the patch
will be less.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Currently powernv reboot and shutdown requests just leave secondaries
to do their own things. This is undesirable because they can trigger
any number of watchdogs while waiting for reboot, but also we don't
know what else they might be doing -- they might be causing trouble,
trampling memory, etc.
The opal scheduled flash update code already ran into watchdog problems
due to flashing taking a long time, and it was fixed with 2196c6f1ed
("powerpc/powernv: Return secondary CPUs to firmware before FW update"),
which returns secondaries to opal. It's been found that regular reboots
can take over 10 seconds, which can result in the hard lockup watchdog
firing,
reboot: Restarting system
[ 360.038896709,5] OPAL: Reboot request...
Watchdog CPU:0 Hard LOCKUP
Watchdog CPU:44 detected Hard LOCKUP other CPUS:16
Watchdog CPU:16 Hard LOCKUP
watchdog: BUG: soft lockup - CPU#16 stuck for 3s! [swapper/16:0]
This patch removes the special case for flash update, and calls
smp_send_stop in all cases before calling reboot/shutdown.
smp_send_stop could return CPUs to OPAL, the main reason not to is
that the request could come from a NMI that interrupts OPAL code,
so re-entry to OPAL can cause a number of problems. Putting
secondaries into simple spin loops improves the chances of a
successful reboot.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This moves the definition of the default security feature flags
(i.e., enabled by default) closer to the security feature flags.
This can be used to restore current flags to the default flags.
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Commit 8e0b634b1327 ("powerpc/64s: Do not allocate lppaca if we are
not virtualized") removed allocation of lppaca on bare metal
platforms. But with CONFIG_PPC_SPLPAR enabled, we still access the
lppaca on bare metal in some code paths.
Fix this but adding runtime checks for SPLPAR (shared processor LPAR).
Fixes: 8e0b634b1327 ("powerpc/64s: Do not allocate lppaca if we are not virtualized")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This reduces vmlinux text size by 1kB and data by 1.5kB with a small
build!
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Add the recently added CPU_FTRS_POWER9_DD2_2 to the little
endian possible mask as noticed by Nick.]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
POWER4 has been broken since at least the change 49d09bf2a6
("powerpc/64s: Optimise MSR handling in exception handling"), which
requires mtmsrd L=1 support. This was introduced in ISA v2.01, and
POWER4 supports ISA v2.00.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The last usage was removed in c17b98cf6028 ("KVM: PPC: Book3S HV:
Remove code for PPC970 processors") (Dec 2014).
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Rather than override the machine type in .S code (which can hide wrong
or ambiguous code generation for the target), set the type to power4
for all assembly.
This also means we need to be careful not to build power4-only code
when we're not building for Book3S, such as the "power7" versions of
copyuser/page/memcpy.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Fix Book3E build, don't build the "power7" variants for non-Book3S]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
ALTIVEC and VSX features are not added by to default to the POWERx CPU
feature sets because they are intended to be enabled by firmware.
Currently they end up in CPU_FTRS_POSSIBLE due to their inclusion in
other the set for other CPUs, eg. PPC970.
But they should be added individually to the CPU_FTRS_POSSIBLE set,
because if we reduce the set of CPUs that are built-for they may
disappear from the possible mask.
It already contains CPU_FTR_VSX, so add ALTIVEC. The _COMP features
should be used because they won't be present if compiled out.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Add detail to change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
It's not a bug to have features missing in CPU_FTR_ALWAYS, but it is a
missed opportunity for optimisation.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Implement a new function to invoke stop, power9_offline_stop, which is
like power9_idle_stop but used by the cpu hotplug code.
Move KVM secondary state manipulation code to the offline case.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| |\ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Bring in yet another series that touches KVM code, and might need to
be merged into the kvm-ppc branch to resolve conflicts.
This required some changes in pnv_power9_force_smt4_catch/release()
due to the paca array becomming an array of pointers.
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Move __map_kernel_page_nid() inside #ifdef SPARSEMEM_VMEMMAP]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Rename the dummy allocate_pacas() to fix 32-bit build]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Build an array that finds hardware CPU number from logical CPU
number in firmware CPU discovery. Use that rather than setting
paca of other CPUs directly, to begin with. Subsequent patch will
not have pacas allocated at this point.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Fix SMP=n build by adding #ifdef in arch_match_cpu_phys_id()]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Split sparsemem initialisation from basic numa topology discovery.
Move the parsing earlier in boot, before pacas are allocated.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
We no longer allocate lppacas in an array, so this patch removes the
1kB static alignment for the structure, and enforces the PAPR
alignment requirements at allocation time. We can not reduce the 1kB
allocation size however, due to existing KVM hypervisors.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Change the paca array into an array of pointers to pacas. Allocate
pacas individually.
This allows flexibility in where the PACAs are allocated. Future work
will allocate them node-local. Platforms that don't have address limits
on PACAs would be able to defer PACA allocations until later in boot
rather than allocate all possible ones up-front then freeing unused.
This is slightly more overhead (one additional indirection) for cross
CPU paca references, but those aren't too common.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The "lppaca" is a structure registered with the hypervisor. This is
unnecessary when running on non-virtualised platforms. One field from
the lppaca (pmcregs_in_use) is also used by the host, so move the host
part out into the paca (lppaca field is still updated in
guest mode).
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Fix non-pseries build with some #ifdefs]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
We need to zero-out pgd table only if we share the slab cache with
pud/pmd level caches. With the support of 4PB, we don't share the slab
cache anymore. Instead of removing the code completely hide it within
an #ifdef. We don't need to do this with any other page table level,
because they all allocate table of double the size and we take of
initializing the first half corrrectly during page table zap.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[mpe: Consolidate multiple #if / #ifdef into one]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This patch increases the max virtual (effective) address value to 4PB.
With 4K page size config we continue to limit ourself to 64TB.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[mpe: Keep the H_PGTABLE_RANGE test, update it to work]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
For addresses above 512TB we allocate additional mmu contexts. To make
it all easy, addresses above 512TB are handled with IR/DR=1 and with
stack frame setup.
The mmu_context_t is also updated to track the new extended_ids. To
support upto 4PB we need a total 8 contexts.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[mpe: Minor formatting tweaks and comment wording, switch BUG to WARN
in get_ea_context().]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Memory keys are supported only with hash translation mode. Instead of
using #ifdef in generic code move the key related pte bits to
respective headers
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
asm/barrier.h is not always included after asm/synch.h, which meant
it was missing __SUBARCH_HAS_LWSYNC, so in some files smp_wmb() would
be eieio when it should be lwsync. kernel/time/hrtimer.c is one case.
__SUBARCH_HAS_LWSYNC is only used in one place, so just fold it in
to where it's used. Previously with my small simulator config, 377
instances of eieio in the tree. After this patch there are 55.
Fixes: 46d075be585e ("powerpc: Optimise smp_wmb")
Cc: stable@vger.kernel.org # v2.6.29+
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| |\ \ \
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Merge our fixes branch from the 4.16 cycle.
There were a number of important fixes merged, in particular some Power9
workarounds that we want in next for testing purposes. There's also been
some conflicting changes in the CPU features code which are best merged
and tested before going upstream.
|
| |\ \ \ \
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Merge the DAWR series, which touches arch code and KVM code and may need
to be merged into the kvm-ppc tree.
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Using the DAWR on POWER9 can cause xstops, hence we need to disable
it.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
POWER7 compat mode guests can use h_set_dabr on POWER9. POWER9 should
use the DAWR but since it's disabled there we can't.
This returns H_UNSUPPORTED on a h_set_dabr() on POWER9 where the DAWR
is disabled.
Current Linux guests ignore this error, so they will silently not get
the DAWR (sigh). The same error code is being used by POWERVM in this
case.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Add ppc_breakpoint_available() to determine if a breakpoint is
available currently via the DAWR or DABR.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Checking for a "fully active" device state requires testing two flag
bits, which is open coded in several places, so add a function to do
it.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Currently the EEH_PE_RECOVERING flag for a PE is managed by both the
caller and callee of eeh_handle_normal_event() (among other places not
considered here). This is complicated by the fact that the PE may
or may not have been invalidated by the call.
So move the callee's handling into eeh_handle_normal_event(), which
clarifies it and allows the return type to be changed to void (because
it no longer needs to indicate at the PE has been invalidated).
This should not change behaviour except in eeh_event_handler() where
it was previously possible to cause eeh_pe_state_clear() to be called
on an invalid PE, which is now avoided.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
The function eeh_handle_event(pe) does nothing other than switching
between calling eeh_handle_normal_event(pe) and
eeh_handle_special_event(). However it is only called in two places,
one where pe can't be NULL and the other where it must be NULL (see
eeh_event_handler()) so it does nothing but obscure the flow of
control.
So, remove it.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Now that we have the security feature flags we can make the
information displayed in the "meltdown" file more informative.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
This commit adds security feature flags to reflect the settings we
receive from firmware regarding Spectre/Meltdown mitigations.
The feature names reflect the names we are given by firmware on bare
metal machines. See the hostboot source for details.
Arguably these could be firmware features, but that then requires them
to be read early in boot so they're available prior to asm feature
patching, but we don't actually want to use them for patching. We may
also want to dynamically update them in future, which would be
incompatible with the way firmware features work (at the moment at
least). So for now just make them separate flags.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Add some additional values which have been defined for the
H_GET_CPU_CHARACTERISTICS hypercall.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
For PowerVM migration we want to be able to call setup_rfi_flush()
again after we've migrated the partition.
To support that we need to check that we're not trying to allocate the
fallback flush area after memblock has gone away (i.e., boot-time only).
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|