path: root/arch/powerpc/mm/fault.c
Commit message    Author    Age    Files    Lines
* signal/powerpc: Use force_sig_fault where appropriate    Eric W. Biederman    2018-09-21    1    -8/+1
|
| Reviewed-by: Stephen Rothwell <sfr@canb.auug.org.au>
| Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* signal/powerpc: Specialize _exception_pkey for handling pkey exceptions    Eric W. Biederman    2018-09-21    1    -1/+1
|
| Now that _exception no longer calls _exception_pkey it is no longer
| necessary to handle any signal with any si_code. All pkey exceptions
| are SIGSEGV paired with SEGV_PKUERR. So just handle that case and
| remove the now unnecessary parameters from _exception_pkey.
|
| Reviewed-by: Stephen Rothwell <sfr@canb.auug.org.au>
| Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* signal/powerpc: Remove pkey parameter from __bad_area_nosemaphore    Eric W. Biederman    2018-09-21    1    -5/+4
|
| Now that bad_key_fault_exception no longer calls __bad_area_nosemaphore
| there is no reason for __bad_area_nosemaphore to handle pkeys.
|
| Reviewed-by: Stephen Rothwell <sfr@canb.auug.org.au>
| Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* signal/powerpc: Call _exception_pkey directly from bad_key_fault_exception    Eric W. Biederman    2018-09-21    1    -1/+11
|
| This removes the need for other code paths to deal with pkey exceptions.
|
| Reviewed-by: Stephen Rothwell <sfr@canb.auug.org.au>
| Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* signal/powerpc: Remove pkey parameter from __bad_area    Eric W. Biederman    2018-09-21    1    -5/+4
|
| There are no callers of __bad_area that pass in a pkey parameter so it
| makes no sense to take one.
|
| Reviewed-by: Stephen Rothwell <sfr@canb.auug.org.au>
| Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* signal/powerpc: Use force_sig_mceerr as appropriate    Eric W. Biederman    2018-09-21    1    -7/+11
|
| In do_sigbus isolate the mceerr signaling code and call
| force_sig_mceerr instead of falling through to the force_sig_info that
| works for all of the other signals.
|
| Reviewed-by: Stephen Rothwell <sfr@canb.auug.org.au>
| Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* Merge branch 'akpm' (patches from Andrew)    Linus Torvalds    2018-08-17    1    -3/+4
|\
| | Merge updates from Andrew Morton:
| |
| |  - a few misc things
| |  - a few Y2038 fixes
| |  - ntfs fixes
| |  - arch/sh tweaks
| |  - ocfs2 updates
| |  - most of MM
| |
| | * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (111 commits)
| |   mm/hmm.c: remove unused variables align_start and align_end
| |   fs/userfaultfd.c: remove redundant pointer uwq
| |   mm, vmacache: hash addresses based on pmd
| |   mm/list_lru: introduce list_lru_shrink_walk_irq()
| |   mm/list_lru.c: pass struct list_lru_node* as an argument to __list_lru_walk_one()
| |   mm/list_lru.c: move locking from __list_lru_walk_one() to its caller
| |   mm/list_lru.c: use list_lru_walk_one() in list_lru_walk_node()
| |   mm, swap: make CONFIG_THP_SWAP depend on CONFIG_SWAP
| |   mm/sparse: delete old sparse_init and enable new one
| |   mm/sparse: add new sparse_init_nid() and sparse_init()
| |   mm/sparse: move buffer init/fini to the common place
| |   mm/sparse: use the new sparse buffer functions in non-vmemmap
| |   mm/sparse: abstract sparse buffer allocations
| |   mm/hugetlb.c: don't zero 1GiB bootmem pages
| |   mm, page_alloc: double zone's batchsize
| |   mm/oom_kill.c: document oom_lock
| |   mm/hugetlb: remove gigantic page support for HIGHMEM
| |   mm, oom: remove sleep from under oom_lock
| |   kernel/dma: remove unsupported gfp_mask parameter from dma_alloc_from_contiguous()
| |   mm/cma: remove unsupported gfp_mask parameter from cma_alloc()
| |   ...
| * mm: convert return type of handle_mm_fault() caller to vm_fault_t    Souptick Joarder    2018-08-17    1    -3/+4
| |
| | Use new return type vm_fault_t for fault handler. For now, this is
| | just documenting that the function returns a VM_FAULT value rather
| | than an errno. Once all instances are converted, vm_fault_t will
| | become a distinct type.
| |
| | Ref-> commit 1c8f422059ae ("mm: change return type to vm_fault_t")
| |
| | In this patch all the callers of handle_mm_fault() are changed to
| | return vm_fault_t type.
| |
| | Link: http://lkml.kernel.org/r/20180617084810.GA6730@jordon-HP-15-Notebook-PC
| | Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
| | Cc: Matthew Wilcox <willy@infradead.org>
| | Cc: Richard Henderson <rth@twiddle.net>
| | Cc: Tony Luck <tony.luck@intel.com>
| | Cc: Matt Turner <mattst88@gmail.com>
| | Cc: Vineet Gupta <vgupta@synopsys.com>
| | Cc: Russell King <linux@armlinux.org.uk>
| | Cc: Catalin Marinas <catalin.marinas@arm.com>
| | Cc: Will Deacon <will.deacon@arm.com>
| | Cc: Richard Kuo <rkuo@codeaurora.org>
| | Cc: Geert Uytterhoeven <geert@linux-m68k.org>
| | Cc: Michal Simek <monstr@monstr.eu>
| | Cc: James Hogan <jhogan@kernel.org>
| | Cc: Ley Foon Tan <lftan@altera.com>
| | Cc: Jonas Bonn <jonas@southpole.se>
| | Cc: James E.J. Bottomley <jejb@parisc-linux.org>
| | Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| | Cc: Palmer Dabbelt <palmer@sifive.com>
| | Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
| | Cc: David S. Miller <davem@davemloft.net>
| | Cc: Richard Weinberger <richard@nod.at>
| | Cc: Guan Xuetao <gxt@pku.edu.cn>
| | Cc: Thomas Gleixner <tglx@linutronix.de>
| | Cc: "H. Peter Anvin" <hpa@zytor.com>
| | Cc: "Levin, Alexander (Sasha Levin)" <alexander.levin@verizon.com>
| | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
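The difference between a VM_FAULT value and an errno, which the commit above documents, can be sketched in user space. Everything here is hypothetical: `demo_vm_fault_t`, the `DEMO_VM_FAULT_*` bit values, and the functions are invented for illustration and are not the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for vm_fault_t: a bitmask of fault outcomes. */
typedef unsigned int demo_vm_fault_t;

#define DEMO_VM_FAULT_OOM    0x1u  /* illustrative bit values, not the kernel's */
#define DEMO_VM_FAULT_SIGBUS 0x2u
#define DEMO_VM_FAULT_MAJOR  0x4u

/* A fault handler reports outcomes as flag bits, never as a negative errno. */
static demo_vm_fault_t demo_handle_fault(int backing_store_ok)
{
    if (!backing_store_ok)
        return DEMO_VM_FAULT_SIGBUS;
    return DEMO_VM_FAULT_MAJOR;
}

/* Callers translate the bitmask into an errno only at the boundary. */
static int demo_fault_to_errno(demo_vm_fault_t fault)
{
    if (fault & DEMO_VM_FAULT_OOM)
        return -ENOMEM;
    if (fault & DEMO_VM_FAULT_SIGBUS)
        return -EFAULT;
    return 0;
}

/* Usage: a failed fault maps to -EFAULT, a successful major fault to 0. */
static void demo_check_fault(void)
{
    assert(demo_fault_to_errno(demo_handle_fault(0)) == -EFAULT);
    assert(demo_fault_to_errno(demo_handle_fault(1)) == 0);
}
```

Giving the bitmask its own typedef makes it harder to mix the two conventions by accident, which appears to be the motivation stated in the commit message.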
* | powerpc: remove unnecessary inclusion of asm/tlbflush.h    Christophe Leroy    2018-07-30    1    -1/+0
|/
| asm/tlbflush.h is only needed for:
| - using functions xxx_flush_tlb_xxx()
| - using MMU_NO_CONTEXT
| - including asm-generic/pgtable.h
|
| Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* Merge tag 'powerpc-4.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux    Linus Torvalds    2018-06-07    1    -29/+47
|\
| | Pull powerpc updates from Michael Ellerman:
| |  "Notable changes:
| |
| |    - Support for split PMD page table lock on 64-bit Book3S (Power8/9).
| |    - Add support for HAVE_RELIABLE_STACKTRACE, so we properly support
| |      live patching again.
| |    - Add support for patching barrier_nospec in copy_from_user() and
| |      syscall entry.
| |    - A couple of fixes for our data breakpoints on Book3S.
| |    - A series from Nick optimising TLB/mm handling with the Radix MMU.
| |    - Numerous small cleanups to squash sparse/gcc warnings from
| |      Mathieu Malaterre.
| |    - Several series optimising various parts of the 32-bit code from
| |      Christophe Leroy.
| |    - Removal of support for two old machines, "SBC834xE" and "C2K"
| |      ("GEFanuc,C2K"), which is why the diffstat has so many deletions.
| |
| |   And many other small improvements & fixes.
| |
| |   There's a few out-of-area changes. Some minor ftrace changes OK'ed
| |   by Steve, and a fix to our powernv cpuidle driver. Then there's a
| |   series touching mm, x86 and fs/proc/task_mmu.c, which cleans up some
| |   details around pkey support. It was ack'ed/reviewed by Ingo & Dave
| |   and has been in next for several weeks.
| |
| |   Thanks to: Akshay Adiga, Alastair D'Silva, Alexey Kardashevskiy, Al
| |   Viro, Andrew Donnellan, Aneesh Kumar K.V, Anju T Sudhakar, Arnd
| |   Bergmann, Balbir Singh, Cédric Le Goater, Christophe Leroy,
| |   Christophe Lombard, Colin Ian King, Dave Hansen, Fabio Estevam, Finn
| |   Thain, Frederic Barrat, Gautham R. Shenoy, Haren Myneni, Hari
| |   Bathini, Ingo Molnar, Jonathan Neuschäfer, Josh Poimboeuf, Kamalesh
| |   Babulal, Madhavan Srinivasan, Mahesh Salgaonkar, Mark Greer, Mathieu
| |   Malaterre, Matthew Wilcox, Michael Neuling, Michal Suchanek, Naveen
| |   N. Rao, Nicholas Piggin, Nicolai Stange, Olof Johansson, Paul
| |   Gortmaker, Paul Mackerras, Peter Rosin, Pridhiviraj Paidipeddi, Ram
| |   Pai, Rashmica Gupta, Ravi Bangoria, Russell Currey, Sam Bobroff,
| |   Samuel Mendoza-Jonas, Segher Boessenkool, Shilpasri G Bhat, Simon
| |   Guo, Souptick Joarder, Stewart Smith, Thiago Jung Bauermann, Torsten
| |   Duwe, Vaibhav Jain, Wei Yongjun, Wolfram Sang, Yisheng Xie,
| |   YueHaibing"
| |
| | * tag 'powerpc-4.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (251 commits)
| |   powerpc/64s/radix: Fix missing ptesync in flush_cache_vmap
| |   cpuidle: powernv: Fix promotion from snooze if next state disabled
| |   powerpc: fix build failure by disabling attribute-alias warning in pci_32
| |   ocxl: Fix missing unlock on error in afu_ioctl_enable_p9_wait()
| |   powerpc-opal: fix spelling mistake "Uniterrupted" -> "Uninterrupted"
| |   powerpc: fix spelling mistake: "Usupported" -> "Unsupported"
| |   powerpc/pkeys: Detach execute_only key on !PROT_EXEC
| |   powerpc/powernv: copy/paste - Mask SO bit in CR
| |   powerpc: Remove core support for Marvell mv64x60 hostbridges
| |   powerpc/boot: Remove core support for Marvell mv64x60 hostbridges
| |   powerpc/boot: Remove support for Marvell mv64x60 i2c controller
| |   powerpc/boot: Remove support for Marvell MPSC serial controller
| |   powerpc/embedded6xx: Remove C2K board support
| |   powerpc/lib: optimise PPC32 memcmp
| |   powerpc/lib: optimise 32 bits __clear_user()
| |   powerpc/time: inline arch_vtime_task_switch()
| |   powerpc/Makefile: set -mcpu=860 flag for the 8xx
| |   powerpc: Implement csum_ipv6_magic in assembly
| |   powerpc/32: Optimise __csum_partial()
| |   powerpc/lib: Adjust .balign inside string functions for PPC32
| |   ...
| * powerpc/mm: Only read faulting instruction when necessary in do_page_fault()    Christophe Leroy    2018-05-25    1    -16/+34
| |
| | Commit a7a9dcd882a67 ("powerpc: Avoid taking a data miss on every
| | userspace instruction miss") has shown that limiting the read of
| | faulting instruction to likely cases improves performance.
| |
| | This patch goes further in this direction by limiting the read
| | of the faulting instruction to the only cases where it is likely
| | needed.
| |
| | On an MPC885, with the same benchmark app as in the commit referred
| | to above, we see a reduction of about 3900 dTLB misses (approx 3%):
| |
| | Before the patch:
| |  Performance counter stats for './fault 500' (10 runs):
| |
| |      683033312      cpu-cycles          ( +- 0.03% )
| |         134538      dTLB-load-misses    ( +- 0.03% )
| |          46099      iTLB-load-misses    ( +- 0.02% )
| |          19681      faults              ( +- 0.02% )
| |
| |    5.389747878 seconds time elapsed     ( +- 0.06% )
| |
| | With the patch:
| |  Performance counter stats for './fault 500' (10 runs):
| |
| |      682112862      cpu-cycles          ( +- 0.03% )
| |         130619      dTLB-load-misses    ( +- 0.03% )
| |          46073      iTLB-load-misses    ( +- 0.05% )
| |          19681      faults              ( +- 0.01% )
| |
| |    5.381342641 seconds time elapsed     ( +- 0.07% )
| |
| | The proper work of the huge stack expansion was tested with the
| | following app:
| |
| |     #include <stdio.h>
| |     #include <stdlib.h>
| |
| |     int main(int argc, char **argv)
| |     {
| |         /* Slightly over 1 MiB, so the first store lands far below the stack. */
| |         char buf[1024 * 1025];
| |
| |         sprintf(buf, "Hello world !\n");
| |         printf("%s", buf);
| |         exit(0);
| |     }
| |
| | Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
| | Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
| | [mpe: Add include of pagemap.h to fix build errors]
| | Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc/mm: Use instruction symbolic names in store_updates_sp()    Christophe Leroy    2018-05-25    1    -13/+13
| |
| | Use symbolic names defined in asm/ppc-opcode.h instead of hardcoded
| | values.
| |
| | Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
| | Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | signal: Ensure every siginfo we send has all bits initialized    Eric W. Biederman    2018-04-25    1    -0/+1
|/
| Call clear_siginfo to ensure every stack allocated siginfo is properly
| initialized before being passed to the signal sending functions.
|
| Note: It is not safe to depend on C initializers to initialize struct
| siginfo on the stack because C is allowed to skip holes when
| initializing a structure.
|
| The initialization of struct siginfo in tracehook_report_syscall_exit
| was moved from the helper user_single_step_siginfo into
| tracehook_report_syscall_exit itself, to make it clear that the local
| variable siginfo gets fully initialized.
|
| In a few cases the scope of struct siginfo has been reduced to make it
| clear that siginfo is not used on other paths in the function in which
| it is declared.
|
| Instances of using memset to initialize siginfo have been replaced
| with calls to clear_siginfo for clarity.
|
| Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
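The note about holes can be illustrated with a small user-space sketch. `struct demo_info` and both helpers are hypothetical names chosen for this example; the only assumption carried over from the commit is that clearing the whole object byte by byte, padding included, is what makes it safe to hand to code that copies it out wholesale.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical structure; most ABIs insert padding between code and addr. */
struct demo_info {
    char  code;
    void *addr;
};

/* Clear every byte, padding included, in the spirit of clear_siginfo(). */
static void demo_clear_info(struct demo_info *info)
{
    memset(info, 0, sizeof(*info));
}

/* Return 1 if all sizeof(*info) bytes are zero, 0 otherwise. */
static int demo_all_bytes_zero(const struct demo_info *info)
{
    const unsigned char *p = (const unsigned char *)info;
    size_t i;

    for (i = 0; i < sizeof(*info); i++)
        if (p[i] != 0)
            return 0;
    return 1;
}

/* Usage: poison the object first so stale padding bytes would be visible. */
static void demo_check_clear(void)
{
    struct demo_info info;

    memset(&info, 0xff, sizeof(info));
    demo_clear_info(&info);
    assert(demo_all_bytes_zero(&info));
}
```

A designated initializer such as `struct demo_info info = { .code = 1 };` sets the named members, but the C standard does not promise anything about the padding bytes, which is the gap the commit closes.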
* powerpc/mm/keys: Update documentation and remove unnecessary check    Aneesh Kumar K.V    2018-04-04    1    -16/+12
|
| Adds more code comments. We also remove an unnecessary pkey check
| after we check for pkey error in this patch.
|
| Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* Merge branch 'fixes' into next    Michael Ellerman    2018-01-21    1    -1/+6
|\
| | Merge our fixes branch from the 4.15 cycle.
| |
| | Unusually the fixes branch saw some significant features merged,
| | notably the RFI flush patches, so we want the code in next to be
| | tested against that, to avoid any surprises when the two are merged.
| |
| | There's also some other work on the panic handling that was reverted
| | in fixes and we now want to do properly in next, which would conflict.
| |
| | And we also fix a few other minor merge conflicts.
| * powerpc/mm: Fix SEGV on mapped region to return SEGV_ACCERR    John Sperbeck    2018-01-02    1    -1/+6
| |
| | The recent refactoring of the powerpc page fault handler in commit
| | c3350602e876 ("powerpc/mm: Make bad_area* helper functions") caused
| | access to protected memory regions to indicate SEGV_MAPERR instead of
| | the traditional SEGV_ACCERR in the si_code field of a user-space
| | signal handler. This can confuse debug libraries that temporarily
| | change the protection of memory regions, and expect to use SEGV_ACCERR
| | as an indication to restore access to a region.
| |
| | This commit restores the previous behavior. The following program
| | exhibits the issue:
| |
| |     $ ./repro read || echo "FAILED"
| |     $ ./repro write || echo "FAILED"
| |     $ ./repro exec || echo "FAILED"
| |
| |     #include <stdio.h>
| |     #include <stdlib.h>
| |     #include <string.h>
| |     #include <unistd.h>
| |     #include <signal.h>
| |     #include <sys/mman.h>
| |     #include <assert.h>
| |
| |     static void segv_handler(int n, siginfo_t *info, void *arg)
| |     {
| |         _exit(info->si_code == SEGV_ACCERR ? 0 : 1);
| |     }
| |
| |     int main(int argc, char **argv)
| |     {
| |         void *p = NULL;
| |         struct sigaction act = {
| |             .sa_sigaction = segv_handler,
| |             .sa_flags = SA_SIGINFO,
| |         };
| |
| |         assert(argc == 2);
| |         p = mmap(NULL, getpagesize(),
| |                  (strcmp(argv[1], "write") == 0) ? PROT_READ : 0,
| |                  MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
| |         assert(p != MAP_FAILED);
| |
| |         assert(sigaction(SIGSEGV, &act, NULL) == 0);
| |         if (strcmp(argv[1], "read") == 0)
| |             printf("%c", *(unsigned char *)p);
| |         else if (strcmp(argv[1], "write") == 0)
| |             *(unsigned char *)p = 0;
| |         else if (strcmp(argv[1], "exec") == 0)
| |             ((void (*)(void))p)();
| |         return 1;  /* failed to generate SEGV */
| |     }
| |
| | Fixes: c3350602e876 ("powerpc/mm: Make bad_area* helper functions")
| | Cc: stable@vger.kernel.org # v4.14+
| | Signed-off-by: John Sperbeck <jsperbeck@google.com>
| | Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| | [mpe: Add commit references in change log]
| | Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc: Deliver SEGV signal on pkey violation    Ram Pai    2018-01-20    1    -12/+27
| |
| | The value of the pkey, whose protection got violated, is made
| | available in si_pkey field of the siginfo structure.
| |
| | Signed-off-by: Ram Pai <linuxram@us.ibm.com>
| | Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
| | Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc: Handle exceptions caused by pkey violation    Ram Pai    2018-01-20    1    -0/+22
| |
| | Handle Data and Instruction exceptions caused by memory
| | protection-key.
| |
| | The CPU will detect the key fault if the HPTE is already programmed
| | with the key.
| |
| | However if the HPTE is not hashed, a key fault will not be detected by
| | the hardware. The software will detect pkey violation in such a case.
| |
| | Signed-off-by: Ram Pai <linuxram@us.ibm.com>
| | Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
| | Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc: Use the TRAP macro whenever comparing a trap number    Benjamin Herrenschmidt    2018-01-16    1    -1/+1
|/
| Trap numbers can have extra bits at the bottom that need to be
| filtered out. There are a few cases where we don't do that.
|
| It's possible that we got lucky but better safe than sorry.
|
| Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
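Why an unmasked comparison can miss is easy to show in isolation. The sketch below is a user-space illustration with hypothetical names (`struct demo_regs`, `DEMO_TRAP`); the assumption that exactly the low four bits are flag bits is made here for the example and should be checked against the real TRAP() definition.

```c
#include <assert.h>

/* Hypothetical register frame; only the trap field matters here. */
struct demo_regs {
    unsigned long trap;
};

/* Assumption: the low four bits carry flags rather than the vector number. */
#define DEMO_TRAP(regs) ((regs)->trap & ~0xFUL)

/* Returns 1 when the masked trap number is 0x400 (an instruction fault). */
static int demo_is_exec_fault(const struct demo_regs *regs)
{
    return DEMO_TRAP(regs) == 0x400;
}

/* Usage: the same vector with a low flag bit set must still match. */
static void demo_check_trap(void)
{
    struct demo_regs clean = { .trap = 0x400 };
    struct demo_regs flagged = { .trap = 0x401 };

    assert(demo_is_exec_fault(&clean));
    assert(demo_is_exec_fault(&flagged));  /* masked compare still matches */
    assert(flagged.trap != 0x400);         /* a raw compare would have missed it */
}
```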
* powerpc/8xx: Use symbolic names for DSISR bits in DSI    Christophe Leroy    2017-08-10    1    -1/+1
|
| Use symbolic names for DSISR bits in DSI
|
| Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/8xx: Getting rid of remaining use of CONFIG_8xx    Christophe Leroy    2017-08-10    1    -1/+1
|
| Two config options exist to define powerpc MPC8xx:
| * CONFIG_PPC_8xx
| * CONFIG_8xx
|
| arch/powerpc/platforms/Kconfig.cputype has contained the following
| comment about CONFIG_8xx item for some years:
| "# this is temp to handle compat with arch=ppc"
|
| arch/powerpc is now the only place with remaining use of
| CONFIG_8xx: get rid of them.
|
| Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc: Remove old unused icswx based coprocessor support    Benjamin Herrenschmidt    2017-08-03    1    -15/+0
|
| We have a whole pile of unused code to maintain the ACOP register,
| allocate coprocessor PIDs and handle ACOP faults. This mechanism was
| used for the HFI adapter on POWER7 which is dead and gone and whose
| driver never went upstream. It was used on some A2 core based stuff
| that also never saw the light of day.
|
| Take out all that code.
|
| There is still some POWER8 coprocessor code that uses icswx but it's
| kernel only and thus doesn't use any of that infrastructure.
|
| Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Cleanup check for stack expansion    Benjamin Herrenschmidt    2017-08-03    1    -36/+48
|
| When hitting below a VM_GROWSDOWN vma (typically growing the stack),
| we check whether it's a valid stack-growing instruction and we check
| the distance to GPR1. This is largely open coded with lots of comments,
| so move it out to a helper.
|
| While at it, make store_update_sp a boolean.
|
| Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
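The two checks named above, instruction type and distance to GPR1, can be sketched as one predicate. This is a user-space illustration only: the function name, its parameters, and both threshold values are assumptions chosen for the example and are not taken from the commit.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative thresholds; treat both values as assumptions. */
#define DEMO_BIG_GAP  0x100000UL  /* distance from the VMA end where checks kick in */
#define DEMO_SP_SLACK 2048UL      /* how far below the stack pointer an access may land */

/*
 * Returns true when an access below a grows-down VMA should be rejected.
 * sp models GPR1; store_updates_sp says whether the faulting instruction
 * is a store that also updates the stack pointer.
 */
static bool demo_bad_stack_expansion(unsigned long address, unsigned long vma_end,
                                     unsigned long sp, bool store_updates_sp)
{
    /* Close to the existing stack: always allow the expansion. */
    if (address + DEMO_BIG_GAP >= vma_end)
        return false;
    /* Far away: accept only accesses near sp, or sp-updating stores. */
    if (address + DEMO_SP_SLACK < sp && !store_updates_sp)
        return true;
    return false;
}

/* Usage: a far access well below sp is rejected unless the store moves sp. */
static void demo_check_stack(void)
{
    unsigned long vma_end = 0x70000000UL;
    unsigned long sp = vma_end - 0x1000UL;
    unsigned long far = vma_end - 0x200000UL;

    assert(!demo_bad_stack_expansion(vma_end - 0x2000UL, vma_end, sp, false));
    assert(demo_bad_stack_expansion(far, vma_end, sp, false));
    assert(!demo_bad_stack_expansion(far, vma_end, sp, true));
}
```

Keeping the predicate in its own function, as the commit does with its helper, lets the caller read as a single yes/no decision.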
* powerpc/mm: Don't lose "major" fault indication on retry    Benjamin Herrenschmidt    2017-08-03    1    -2/+3
|
| If the first iteration returns VM_FAULT_MAJOR but the second
| one doesn't, we fail to account the fault as a major fault.
|
| This fixes it and brings the code in line with x86.
|
| Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
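The accounting problem comes down to remembering a flag across loop iterations. A minimal user-space sketch, assuming hypothetical flag values and a scripted array standing in for successive handle_mm_fault() results:

```c
#include <assert.h>

#define DEMO_FAULT_MAJOR 0x1u  /* hypothetical flag bits */
#define DEMO_FAULT_RETRY 0x2u

/*
 * Replays a scripted sequence of handler results and reports whether
 * the fault must be accounted as major.
 */
static int demo_fault_was_major(const unsigned int *results, int n)
{
    unsigned int major = 0;
    int i;

    for (i = 0; i < n; i++) {
        major |= results[i] & DEMO_FAULT_MAJOR;  /* remember across retries */
        if (!(results[i] & DEMO_FAULT_RETRY))
            break;                               /* final attempt reached */
    }
    return major != 0;
}

/* Usage: major on the first pass, plain success on the retry. */
static void demo_check_major(void)
{
    const unsigned int seq[] = { DEMO_FAULT_MAJOR | DEMO_FAULT_RETRY, 0 };

    assert(demo_fault_was_major(seq, 2) == 1);
}
```

Testing only the last result would report 0 for that sequence, which is the lost indication the commit describes.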
* powerpc/mm: Move page fault VMA access checks to a helper    Benjamin Herrenschmidt    2017-08-03    1    -24/+33
|
| Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Set fault flags earlier    Benjamin Herrenschmidt    2017-08-03    1    -1/+4
|
| Move out the code that sets FAULT_FLAG_WRITE so the block that checks
| access permissions can be extracted. While at it also set
| FAULT_FLAG_INSTRUCTION which will be used for protection keys.
|
| Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Add a bunch of (un)likely annotations to do_page_fault    Benjamin Herrenschmidt    2017-08-03    1    -10/+10
|
| Mostly for the failure cases
|
| Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Move/simplify faulthandler_disabled() and !mm check    Benjamin Herrenschmidt    2017-08-03    1    -14/+13
|
| Do the check before we re-enable interrupts and clean the code up a
| bit.
|
| Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Move the DSISR_PROTFAULT sanity check    Benjamin Herrenschmidt    2017-08-03    1    -33/+42
|
| This has a page of comment explaining what's going on right in the
| middle of do_page_fault() which makes things a bit hard to follow.
|
| Move it to a helper instead. Also do the test earlier as there's no
| point waiting until after we found the VMA.
|
| Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Cosmetic fix to page fault accounting    Benjamin Herrenschmidt    2017-08-03    1    -4/+2
|
| No need to break those lines, they aren't that long
|
| Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Move CMO accounting out of do_page_fault into a helper    Benjamin Herrenschmidt    2017-08-03    1    -11/+18
|
| It makes do_page_fault() more readable. No functional change.
|
| Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Rework mm_fault_error()    Benjamin Herrenschmidt    2017-08-03    1    -38/+28
|
| First, handle the normal retry failure in do_page_fault itself, since
| it's a simple return statement. That allows us to remove the
| "continue" special return code from mm_fault_error().
|
| Once that's done, we can have an implementation much closer to x86
| where we only call mm_fault_error() if VM_FAULT_ERROR is set and
| directly return.
|
| Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Make bad_area* helper functions    Benjamin Herrenschmidt    2017-08-03    1    -28/+50
|
| Instead of using goto labels, call those functions and return.
|
| This gets us closer to x86 and allows us to shrink do_page_fault()
| even more.
|
| The main difference with x86 is that those functions return a value
| which we then return from do_page_fault(). That value is our return
| value from do_page_fault() which we use to generate kernel faults.
|
| Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Fix reporting of kernel execute faults    Benjamin Herrenschmidt    2017-08-03    1    -6/+15
|
| We currently test for is_exec and DSISR_PROTFAULT but that doesn't
| make sense as this is the wrong error bit to test for an execute
| permission failure.
|
| In fact, we had code that would return early if we had an exec fault
| in kernel mode so I think that was just dead code anyway.
|
| Finally the location of that test is awkward and prevents further
| simplifications.
|
| So instead move that test into a helper along with the existing early
| test for kernel exec faults and out of range accesses, and put it all
| in a "bad_kernel_fault()" helper. While at it test the correct error
| bits.
|
| Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Simplify returns from __do_page_fault    Benjamin Herrenschmidt    2017-08-03    1    -23/+16
|
| Now that we moved the exception state handling to a wrapper, we can
| just directly return rather than "goto bail"
|
| Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Move debugger check to notify_page_fault()    Benjamin Herrenschmidt    2017-08-03    1    -13/+8
|
| unclutters the main path
|
| Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Overhaul handling of bad page faults    Benjamin Herrenschmidt    2017-08-03    1    -18/+14
|
| A bad page fault is when the HW signals an error such as a bad
| copy/paste, an AMO error, or some other type of error that will not be
| fixed by updating the PTE.
|
| Use a helper page_fault_is_bad() to check for bad page faults thus
| removing the per-processor family open-coding in __do_page_fault()
| and trigger a SIGBUS rather than a SIGSEGV which is more appropriate.
|
| Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Move error_code checks for bad faults earlier    Benjamin Herrenschmidt    2017-08-03    1    -15/+20
|
| There's no point looking for the VMA etc.. when we already know we
| are going to fail.
|
| This adds some code to set "code" for the si_code but that will be
| gone in subsequent patches.
|
| Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Move out definition of CPU specific is_write bits    Benjamin Herrenschmidt    2017-08-03    1    -7/+11
|
| Define a common page_fault_is_write() helper and use it
|
| Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
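A helper of this kind hides which error-code bit means "write" on each CPU family. The sketch below is a user-space illustration: the names are hypothetical, the two bit values are assumptions made to keep the example concrete, and the family is picked at run time here whereas a kernel helper would more likely be selected at build time.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed bit values, shown only to make the example concrete. */
#define DEMO_DSISR_ISSTORE 0x02000000UL  /* "store" bit on one family */
#define DEMO_ESR_DST       0x00800000UL  /* "store" bit on another family */

enum demo_cpu_family { DEMO_BOOK3S, DEMO_BOOKE };

/* One helper answers "was this a write?" regardless of family. */
static bool demo_page_fault_is_write(enum demo_cpu_family family,
                                     unsigned long error_code)
{
    unsigned long mask = (family == DEMO_BOOKE) ? DEMO_ESR_DST
                                                : DEMO_DSISR_ISSTORE;

    return (error_code & mask) != 0;
}

/* Usage: the same error code reads differently on the two families. */
static void demo_check_is_write(void)
{
    assert(demo_page_fault_is_write(DEMO_BOOK3S, DEMO_DSISR_ISSTORE));
    assert(!demo_page_fault_is_write(DEMO_BOOKE, DEMO_DSISR_ISSTORE));
    assert(demo_page_fault_is_write(DEMO_BOOKE, DEMO_ESR_DST));
}
```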
* powerpc/6xx: Handle DABR match before calling do_page_fault    Benjamin Herrenschmidt    2017-08-03    1    -9/+0
|
| On legacy 6xx 32-bit processors, we checked for the DABR match bit in
| DSISR from do_page_fault(), in the middle of a pile of ifdef's because
| all other CPU types do it in assembly prior to calling do_page_fault.
| Fix that.
|
| Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| [mpe: Add #ifdef CONFIG_6xx]
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Pre-filter SRR1 bits before do_page_fault()    Benjamin Herrenschmidt    2017-08-02    1    -12/+2
|
| By filtering the relevant SRR1 bits in the assembly rather than in
| do_page_fault() itself, we avoid a conditional branch (since we
| already come from different path for data and instruction faults).
|
| This will allow more simplifications later
|
| Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Move exception_enter/exit to a do_page_fault wrapper    Benjamin Herrenschmidt    2017-08-02    1    -3/+11
|
| This will allow simplifying the returns from do_page_fault
|
| Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: The 8xx doesn't call do_page_fault() for breakpoints    Christophe Leroy    2017-06-02    1    -1/+1
|
| The 8xx has a dedicated exception for breakpoints, that directly
| calls do_break()
|
| Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Evaluate user_mode(regs) only once in do_page_fault()    Christophe Leroy    2017-06-02    1    -6/+7
|
| Analysis of the assembly code shows that when using user_mode(regs),
| at least the 'andi.' is redone all the time, and also
| the 'lwz ,132(r31)' most of the time. With the new form, the 'is_user'
| is mapped to cr4, then all further use of is_user results in just
| things like 'beq cr4,218 <do_page_fault+0x218>'
|
| Without the patch:
|
|      50: 81 1e 00 84   lwz     r8,132(r30)
|      54: 71 09 40 00   andi.   r9,r8,16384
|      58: 40 82 00 0c   bne     64 <do_page_fault+0x64>
|      84: 81 3e 00 84   lwz     r9,132(r30)
|      8c: 71 2a 40 00   andi.   r10,r9,16384
|      90: 41 a2 01 64   beq     1f4 <do_page_fault+0x1f4>
|      d4: 81 3e 00 84   lwz     r9,132(r30)
|      dc: 71 28 40 00   andi.   r8,r9,16384
|      e0: 41 82 02 08   beq     2e8 <do_page_fault+0x2e8>
|     108: 81 3e 00 84   lwz     r9,132(r30)
|     110: 71 28 40 00   andi.   r8,r9,16384
|     118: 41 82 02 28   beq     340 <do_page_fault+0x340>
|     1e4: 81 3e 00 84   lwz     r9,132(r30)
|     1e8: 71 2a 40 00   andi.   r10,r9,16384
|     1ec: 40 82 01 68   bne     354 <do_page_fault+0x354>
|     228: 81 3e 00 84   lwz     r9,132(r30)
|     22c: 71 28 40 00   andi.   r8,r9,16384
|     230: 41 82 ff c4   beq     1f4 <do_page_fault+0x1f4>
|     288: 71 2a 40 00   andi.   r10,r9,16384
|     294: 41 a2 fe 60   beq     f4 <do_page_fault+0xf4>
|     50c: 81 3e 00 84   lwz     r9,132(r30)
|     514: 71 2a 40 00   andi.   r10,r9,16384
|     518: 40 a2 fc e0   bne     1f8 <do_page_fault+0x1f8>
|     534: 81 3e 00 84   lwz     r9,132(r30)
|     53c: 71 2a 40 00   andi.   r10,r9,16384
|     540: 41 82 fc b8   beq     1f8 <do_page_fault+0x1f8>
|
| This patch creates a local var called 'is_user' which contains the
| result of user_mode(regs)
|
| With the patch:
|
|      20: 81 03 00 84   lwz     r8,132(r3)
|      48: 55 09 97 fe   rlwinm  r9,r8,18,31,31
|      58: 2e 09 00 00   cmpwi   cr4,r9,0
|      5c: 40 92 00 0c   bne     cr4,68 <do_page_fault+0x68>
|      88: 41 b2 01 90   beq     cr4,218 <do_page_fault+0x218>
|      d4: 40 92 01 d0   bne     cr4,2a4 <do_page_fault+0x2a4>
|     120: 41 b2 00 f8   beq     cr4,218 <do_page_fault+0x218>
|     138: 41 b2 ff a0   beq     cr4,d8 <do_page_fault+0xd8>
|     1d4: 40 92 00 e0   bne     cr4,2b4 <do_page_fault+0x2b4>
|
| Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
| Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Remove a redundant test in do_page_fault()    Christophe Leroy    2017-06-02    1    -1/+1
|
| The result of (trap == 0x400) is already in is_exec.
|
| Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Only call store_updates_sp() on stores in do_page_fault()    Christophe Leroy    2017-06-02    1    -1/+1
|
| Function store_updates_sp() checks whether the faulting
| instruction is a store updating r1. Therefore we can limit its calls
| to store exceptions.
|
| This patch is an improvement of commit a7a9dcd882a67 ("powerpc:
| Avoid taking a data miss on every userspace instruction miss")
|
| With the same microbenchmark app, run with 500 as argument, on an
| MPC885 we get:
|
| Before this patch: 152000 DTLB misses
| After this patch:  147000 DTLB misses
|
| Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
| Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
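The saving comes from not performing the expensive instruction read unless the fault is a store. A user-space sketch of that short-circuit follows; the counter and both function names are hypothetical, and the stubbed `demo_store_updates_sp()` merely counts how often the costly path would have run.

```c
#include <assert.h>
#include <stdbool.h>

static int demo_reads;  /* counts simulated reads of the faulting instruction */

/* Stand-in for the costly check; a real one would load and decode the instruction. */
static bool demo_store_updates_sp(void)
{
    demo_reads++;
    return true;
}

/* Only consult the costly check for stores; && short-circuits for loads. */
static bool demo_needs_sp_check(bool is_write)
{
    return is_write && demo_store_updates_sp();
}

/* Usage: a load fault costs no instruction read, a store fault costs one. */
static void demo_check_short_circuit(void)
{
    demo_reads = 0;
    assert(!demo_needs_sp_check(false));
    assert(demo_reads == 0);
    assert(demo_needs_sp_check(true));
    assert(demo_reads == 1);
}
```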
* powerpc: Avoid taking a data miss on every userspace instruction miss    Anton Blanchard    2017-04-03    1    -1/+1
|
| Early on in do_page_fault() we call store_updates_sp(), regardless of
| the type of exception. For an instruction miss this doesn't make
| sense, because we only use this information to detect if a data miss
| is the result of a stack expansion instruction or not.
|
| Worse still, it results in a data miss within every userspace
| instruction miss handler, because we try and load the very instruction
| we are about to install a pte for!
|
| A simple exec microbenchmark runs 6% faster on POWER8 with this fix:
|
|     #include <stdlib.h>
|     #include <stdio.h>
|     #include <unistd.h>
|
|     int main(int argc, char *argv[])
|     {
|         unsigned long left = atol(argv[1]);
|         char leftstr[16];
|
|         if (left-- == 0)
|             return 0;
|
|         sprintf(leftstr, "%ld", left);
|         execlp(argv[0], argv[0], leftstr, NULL);
|         perror("exec failed\n");
|
|         return 0;
|     }
|
| Pass the number of iterations on the command line (eg 10000) and time
| how long it takes to execute.
|
| Signed-off-by: Anton Blanchard <anton@samba.org>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Move mmap_sem unlocking in do_page_fault()    Laurent Dufour    2017-03-21    1    -15/+4
|
| Since the fault retry is now handled earlier, we can release the
| mmap_sem lock earlier too and remove later unlocking previously done
| in mm_fault_error().
|
| Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
| Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Handle VM_FAULT_RETRY earlier    Laurent Dufour    2017-03-21    1    -29/+38
|
| In do_page_fault() if handle_mm_fault() returns VM_FAULT_RETRY, retry
| the page fault handling before anything else.
|
| This would simplify the handling of the mmap_sem lock in this part of
| the code.
|
| Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
| Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Move mmap_sem unlock up from do_sigbus    Laurent Dufour    2017-03-21    1    -3/+3
|
| Move the mmap_sem release into do_sigbus()'s only caller,
| mm_fault_error().
|
| No functional changes.
|
| Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
| Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
| Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>