summaryrefslogtreecommitdiffstats
path: root/drivers/xen
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'x86-entry-2020-06-12' of ↵Linus Torvalds2020-06-133-57/+19
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 entry updates from Thomas Gleixner: "The x86 entry, exception and interrupt code rework This all started about 6 month ago with the attempt to move the Posix CPU timer heavy lifting out of the timer interrupt code and just have lockless quick checks in that code path. Trivial 5 patches. This unearthed an inconsistency in the KVM handling of task work and the review requested to move all of this into generic code so other architectures can share. Valid request and solved with another 25 patches but those unearthed inconsistencies vs. RCU and instrumentation. Digging into this made it obvious that there are quite some inconsistencies vs. instrumentation in general. The int3 text poke handling in particular was completely unprotected and with the batched update of trace events even more likely to expose to endless int3 recursion. In parallel the RCU implications of instrumenting fragile entry code came up in several discussions. The conclusion of the x86 maintainer team was to go all the way and make the protection against any form of instrumentation of fragile and dangerous code pathes enforcable and verifiable by tooling. A first batch of preparatory work hit mainline with commit d5f744f9a2ac ("Pull x86 entry code updates from Thomas Gleixner") That (almost) full solution introduced a new code section '.noinstr.text' into which all code which needs to be protected from instrumentation of all sorts goes into. Any call into instrumentable code out of this section has to be annotated. objtool has support to validate this. Kprobes now excludes this section fully which also prevents BPF from fiddling with it and all 'noinstr' annotated functions also keep ftrace off. The section, kprobes and objtool changes are already merged. The major changes coming with this are: - Preparatory cleanups - Annotating of relevant functions to move them into the noinstr.text section or enforcing inlining by marking them __always_inline so the compiler cannot misplace or instrument them. - Splitting and simplifying the idtentry macro maze so that it is now clearly separated into simple exception entries and the more interesting ones which use interrupt stacks and have the paranoid handling vs. CR3 and GS. - Move quite some of the low level ASM functionality into C code: - enter_from and exit to user space handling. The ASM code now calls into C after doing the really necessary ASM handling and the return path goes back out without bells and whistels in ASM. - exception entry/exit got the equivivalent treatment - move all IRQ tracepoints from ASM to C so they can be placed as appropriate which is especially important for the int3 recursion issue. - Consolidate the declaration and definition of entry points between 32 and 64 bit. They share a common header and macros now. - Remove the extra device interrupt entry maze and just use the regular exception entry code. - All ASM entry points except NMI are now generated from the shared header file and the corresponding macros in the 32 and 64 bit entry ASM. - The C code entry points are consolidated as well with the help of DEFINE_IDTENTRY*() macros. This allows to ensure at one central point that all corresponding entry points share the same semantics. The actual function body for most entry points is in an instrumentable and sane state. There are special macros for the more sensitive entry points, e.g. INT3 and of course the nasty paranoid #NMI, #MCE, #DB and #DF. They allow to put the whole entry instrumentation and RCU handling into safe places instead of the previous pray that it is correct approach. - The INT3 text poke handling is now completely isolated and the recursion issue banned. Aside of the entry rework this required other isolation work, e.g. the ability to force inline bsearch. - Prevent #DB on fragile entry code, entry relevant memory and disable it on NMI, #MC entry, which allowed to get rid of the nested #DB IST stack shifting hackery. - A few other cleanups and enhancements which have been made possible through this and already merged changes, e.g. consolidating and further restricting the IDT code so the IDT table becomes RO after init which removes yet another popular attack vector - About 680 lines of ASM maze are gone. There are a few open issues: - An escape out of the noinstr section in the MCE handler which needs some more thought but under the aspect that MCE is a complete trainwreck by design and the propability to survive it is low, this was not high on the priority list. - Paravirtualization When PV is enabled then objtool complains about a bunch of indirect calls out of the noinstr section. There are a few straight forward ways to fix this, but the other issues vs. general correctness were more pressing than parawitz. - KVM KVM is inconsistent as well. Patches have been posted, but they have not yet been commented on or picked up by the KVM folks. - IDLE Pretty much the same problems can be found in the low level idle code especially the parts where RCU stopped watching. This was beyond the scope of the more obvious and exposable problems and is on the todo list. The lesson learned from this brain melting exercise to morph the evolved code base into something which can be validated and understood is that once again the violation of the most important engineering principle "correctness first" has caused quite a few people to spend valuable time on problems which could have been avoided in the first place. The "features first" tinkering mindset really has to stop. With that I want to say thanks to everyone involved in contributing to this effort. Special thanks go to the following people (alphabetical order): Alexandre Chartre, Andy Lutomirski, Borislav Petkov, Brian Gerst, Frederic Weisbecker, Josh Poimboeuf, Juergen Gross, Lai Jiangshan, Macro Elver, Paolo Bonzin,i Paul McKenney, Peter Zijlstra, Vitaly Kuznetsov, and Will Deacon" * tag 'x86-entry-2020-06-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (142 commits) x86/entry: Force rcu_irq_enter() when in idle task x86/entry: Make NMI use IDTENTRY_RAW x86/entry: Treat BUG/WARN as NMI-like entries x86/entry: Unbreak __irqentry_text_start/end magic x86/entry: __always_inline CR2 for noinstr lockdep: __always_inline more for noinstr x86/entry: Re-order #DB handler to avoid *SAN instrumentation x86/entry: __always_inline arch_atomic_* for noinstr x86/entry: __always_inline irqflags for noinstr x86/entry: __always_inline debugreg for noinstr x86/idt: Consolidate idt functionality x86/idt: Cleanup trap_init() x86/idt: Use proper constants for table size x86/idt: Add comments about early #PF handling x86/idt: Mark init only functions __init x86/entry: Rename trace_hardirqs_off_prepare() x86/entry: Clarify irq_{enter,exit}_rcu() x86/entry: Remove DBn stacks x86/entry: Remove debug IDT frobbing x86/entry: Optimize local_db_save() for virt ...
| * x86/entry: Convert XEN hypercall vector to IDTENTRY_SYSVECThomas Gleixner2020-06-111-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert the last oldstyle defined vector to IDTENTRY_SYSVEC: - Implement the C entry point with DEFINE_IDTENTRY_SYSVEC - Emit the ASM stub with DECLARE_IDTENTRY_SYSVEC - Remove the ASM idtentries in 64-bit - Remove the BUILD_INTERRUPT entries in 32-bit - Remove the old prototypes Fixup the related XEN code by providing the primary C entry point in x86 to avoid cluttering the generic code with X86'isms. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Andy Lutomirski <luto@kernel.org> Link: https://lore.kernel.org/r/20200521202119.741950104@linutronix.de
| * x86/entry: Switch XEN/PV hypercall entry to IDTENTRYThomas Gleixner2020-06-112-43/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert the XEN/PV hypercall to IDTENTRY: - Emit the ASM stub with DECLARE_IDTENTRY - Remove the ASM idtentry in 64-bit - Remove the open coded ASM entry code in 32-bit - Remove the old prototypes The handler stubs need to stay in ASM code as they need corner case handling and adjustment of the stack pointer. Provide a new C function which invokes the entry/exit handling and calls into the XEN handler on the interrupt stack if required. The exit code is slightly different from the regular idtentry_exit() on non-preemptible kernels. If the hypercall is preemptible and need_resched() is set then XEN provides a preempt hypercall scheduling function. Move this functionality into the entry code so it can use the existing idtentry functionality. [ mingo: Build fixes. ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Andy Lutomirski <luto@kernel.org> Acked-by: Juergen Gross <jgross@suse.com> Tested-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20200521202118.055270078@linutronix.de
| * x86/xen: Split HVM vector callback setup and interrupt gate allocationVitaly Kuznetsov2020-06-111-11/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As a preparatory change for making alloc_intr_gate() __init split xen_callback_vector() into callback vector setup via hypercall (xen_setup_callback_vector()) and interrupt gate allocation (xen_alloc_callback_vector()). xen_setup_callback_vector() is being called twice: on init and upon system resume from xen_hvm_post_suspend(). alloc_intr_gate() only needs to be called once. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20200428093824.1451532-2-vkuznets@redhat.com
* | Merge tag 'for-linus-5.8b-rc1-tag' of ↵Linus Torvalds2020-06-1212-123/+78
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from Juergen Gross: - several smaller cleanups - a fix for a Xen guest regression with CPU offlining - a small fix in the xen pvcalls backend driver - an update of MAINTAINERS * tag 'for-linus-5.8b-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: MAINTAINERS: Update PARAVIRT_OPS_INTERFACE and VMWARE_HYPERVISOR_INTERFACE xen/pci: Get rid of verbose_request and use dev_dbg() instead xenbus: Use dev_printk() when possible xen-pciback: Use dev_printk() when possible xen: enable BALLOON_MEMORY_HOTPLUG by default xen: expand BALLOON_MEMORY_HOTPLUG description xen/pvcalls: Make pvcalls_back_global static xen/cpuhotplug: Fix initial CPU offlining for PV(H) guests xen-platform: Constify dev_pm_ops xen/pvcalls-back: test for errors when calling backend_connect()
| * xen/pci: Get rid of verbose_request and use dev_dbg() insteadBoris Ostrovsky2020-05-294-50/+21
| | | | | | | | | | | | | | | | | | | | Information printed under verbose_request is clearly used for debugging only. Remove it and use dev_dbg() instead. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/1590719092-8578-1-git-send-email-boris.ostrovsky@oracle.com Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
| * xenbus: Use dev_printk() when possibleBjorn Helgaas2020-05-271-6/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Use dev_printk() when possible to include device and driver information in the conventional format. Add "#define dev_fmt" to preserve KBUILD_MODNAME in messages. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20200527174326.254329-3-helgaas@kernel.org Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
| * xen-pciback: Use dev_printk() when possibleBjorn Helgaas2020-05-276-83/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use dev_printk() when possible to include device and driver information in the conventional format. Add "#define dev_fmt" when needed to preserve DRV_NAME or KBUILD_MODNAME in messages. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20200527174326.254329-2-helgaas@kernel.org Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
| * xen: enable BALLOON_MEMORY_HOTPLUG by defaultRoger Pau Monne2020-05-211-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Without it a PVH dom0 is mostly useless, as it would balloon down huge amounts of RAM in order get physical address space to map foreign memory and grants, ultimately leading to an out of memory situation. Such option is also needed for HVM or PVH driver domains, since they also require mapping grants into physical memory regions. Suggested-by: Ian Jackson <ian.jackson@eu.citrix.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Link: https://lore.kernel.org/r/20200324150015.50496-2-roger.pau@citrix.com Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
| * xen: expand BALLOON_MEMORY_HOTPLUG descriptionRoger Pau Monne2020-05-211-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | To mention it's also useful for PVH or HVM domains that require mapping foreign memory or grants. [boris: "non PV" instead of "translated" at Juergen's request] Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Link: https://lore.kernel.org/r/20200324150015.50496-1-roger.pau@citrix.com Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
| * xen/pvcalls: Make pvcalls_back_global staticYueHaibing2020-05-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Fix sparse warning: drivers/xen/pvcalls-back.c:30:3: warning: symbol 'pvcalls_back_global' was not declared. Should it be static? Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Link: https://lore.kernel.org/r/20200410115620.33024-1-yuehaibing@huawei.com Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
| * xen/cpuhotplug: Fix initial CPU offlining for PV(H) guestsBoris Ostrovsky2020-05-211-5/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit a926f81d2f6c ("xen/cpuhotplug: Replace cpu_up/down() with device_online/offline()") replaced cpu_down() with device_offline() call which requires that the CPU has been registered before. This registration, however, happens later from topology_init() which is called as subsys_initcall(). setup_vcpu_hotplug_event(), on the other hand, is invoked earlier, during arch_initcall(). As result, booting a PV(H) guest with vcpus < maxvcpus causes a crash. Move setup_vcpu_hotplug_event() (and therefore setup_cpu_watcher()) to late_initcall(). In addition, instead of performing all offlining steps in setup_cpu_watcher() simply call disable_hotplug_cpu(). Fixes: a926f81d2f6c (xen/cpuhotplug: Replace cpu_up/down() with device_online/offline()" Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/1588976923-3667-1-git-send-email-boris.ostrovsky@oracle.com Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
| * xen-platform: Constify dev_pm_opsRikard Falkeborn2020-05-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | dev_pm_ops is never modified, so mark it const to allow the compiler to put it in read-only memory. Before: text data bss dec hex filename 2457 1668 256 4381 111d drivers/xen/platform-pci.o After: text data bss dec hex filename 2681 1444 256 4381 111d drivers/xen/platform-pci.o Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com> Link: https://lore.kernel.org/r/20200509134755.15038-1-rikard.falkeborn@gmail.com Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
| * xen/pvcalls-back: test for errors when calling backend_connect()Juergen Gross2020-05-211-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | backend_connect() can fail, so switch the device to connected only if no error occurred. Fixes: 0a9c75c2c7258f2 ("xen/pvcalls: xenbus state handling") Cc: stable@vger.kernel.org Signed-off-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20200511074231.19794-1-jgross@suse.com Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
* | Merge branch 'rwonce/rework' of ↵Linus Torvalds2020-06-101-1/+1
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/will/linux Pull READ/WRITE_ONCE rework from Will Deacon: "This the READ_ONCE rework I've been working on for a while, which bumps the minimum GCC version and improves code-gen on arm64 when stack protector is enabled" [ Side note: I'm _really_ tempted to raise the minimum gcc version to 4.9, so that we can just say that we require _Generic() support. That would allow us to more cleanly handle a lot of the cases where we depend on very complex macros with 'sizeof' or __builtin_choose_expr() with __builtin_types_compatible_p() etc. This branch has a workaround for sparse not handling _Generic(), either, but that was already fixed in the sparse development branch, so it's really just gcc-4.9 that we'd require. - Linus ] * 'rwonce/rework' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux: compiler_types.h: Use unoptimized __unqual_scalar_typeof for sparse compiler_types.h: Optimize __unqual_scalar_typeof compilation time compiler.h: Enforce that READ_ONCE_NOCHECK() access size is sizeof(long) compiler-types.h: Include naked type in __pick_integer_type() match READ_ONCE: Fix comment describing 2x32-bit atomicity gcov: Remove old GCC 3.4 support arm64: barrier: Use '__unqual_scalar_typeof' for acquire/release macros locking/barriers: Use '__unqual_scalar_typeof' for load-acquire macros READ_ONCE: Drop pointer qualifiers when reading from scalar types READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses READ_ONCE: Simplify implementations of {READ,WRITE}_ONCE() arm64: csum: Disable KASAN for do_csum() fault_inject: Don't rely on "return value" from WRITE_ONCE() net: tls: Avoid assigning 'const' pointer to non-const pointer netfilter: Avoid assigning 'const' pointer to non-const pointer compiler/gcc: Raise minimum GCC version for kernel builds to 4.8
| * | READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accessesWill Deacon2020-04-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | {READ,WRITE}_ONCE() cannot guarantee atomicity for arbitrary data sizes. This can be surprising to callers that might incorrectly be expecting atomicity for accesses to aggregate structures, although there are other callers where tearing is actually permissable (e.g. if they are using something akin to sequence locking to protect the access). Linus sayeth: | We could also look at being stricter for the normal READ/WRITE_ONCE(), | and require that they are | | (a) regular integer types | | (b) fit in an atomic word | | We actually did (b) for a while, until we noticed that we do it on | loff_t's etc and relaxed the rules. But maybe we could have a | "non-atomic" version of READ/WRITE_ONCE() that is used for the | questionable cases? The slight snag is that we also have to support 64-bit accesses on 32-bit architectures, as these appear to be widespread and tend to work out ok if either the architecture supports atomic 64-bit accesses (x86, armv7) or if the variable being accesses represents a virtual address and therefore only requires 32-bit atomicity in practice. Take a step in that direction by introducing a variant of 'compiletime_assert_atomic_type()' and use it to check the pointer argument to {READ,WRITE}_ONCE(). Expose __{READ,WRITE}_ONCE() variants which are allowed to tear and convert the one broken caller over to the new macros. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Will Deacon <will@kernel.org>
* | | mmap locking API: convert mmap_sem commentsMichel Lespinasse2020-06-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert comments that reference mmap_sem to reference mmap_lock instead. [akpm@linux-foundation.org: fix up linux-next leftovers] [akpm@linux-foundation.org: s/lockaphore/lock/, per Vlastimil] [akpm@linux-foundation.org: more linux-next fixups, per Michel] Signed-off-by: Michel Lespinasse <walken@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Laurent Dufour <ldufour@linux.ibm.com> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ying Han <yinghan@google.com> Link: http://lkml.kernel.org/r/20200520052908.204642-13-walken@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | mmap locking API: use coccinelle to convert mmap_sem rwsem call sitesMichel Lespinasse2020-06-092-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This change converts the existing mmap_sem rwsem calls to use the new mmap locking API instead. The change is generated using coccinelle with the following rule: // spatch --sp-file mmap_lock_api.cocci --in-place --include-headers --dir . @@ expression mm; @@ ( -init_rwsem +mmap_init_lock | -down_write +mmap_write_lock | -down_write_killable +mmap_write_lock_killable | -down_write_trylock +mmap_write_trylock | -up_write +mmap_write_unlock | -downgrade_write +mmap_write_downgrade | -down_read +mmap_read_lock | -down_read_killable +mmap_read_lock_killable | -down_read_trylock +mmap_read_trylock | -up_read +mmap_read_unlock ) -(&mm->mmap_sem) +(mm) Signed-off-by: Michel Lespinasse <walken@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ying Han <yinghan@google.com> Link: http://lkml.kernel.org/r/20200520052908.204642-5-walken@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | mm: don't include asm/pgtable.h if linux/mm.h is already includedMike Rapoport2020-06-096-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Patch series "mm: consolidate definitions of page table accessors", v2. The low level page table accessors (pXY_index(), pXY_offset()) are duplicated across all architectures and sometimes more than once. For instance, we have 31 definition of pgd_offset() for 25 supported architectures. Most of these definitions are actually identical and typically it boils down to, e.g. static inline unsigned long pmd_index(unsigned long address) { return (address >> PMD_SHIFT) & (PTRS_PER_PMD - 1); } static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address) { return (pmd_t *)pud_page_vaddr(*pud) + pmd_index(address); } These definitions can be shared among 90% of the arches provided XYZ_SHIFT, PTRS_PER_XYZ and xyz_page_vaddr() are defined. For architectures that really need a custom version there is always possibility to override the generic version with the usual ifdefs magic. These patches introduce include/linux/pgtable.h that replaces include/asm-generic/pgtable.h and add the definitions of the page table accessors to the new header. This patch (of 12): The linux/mm.h header includes <asm/pgtable.h> to allow inlining of the functions involving page table manipulations, e.g. pte_alloc() and pmd_alloc(). So, there is no point to explicitly include <asm/pgtable.h> in the files that include <linux/mm.h>. The include statements in such cases are remove with a simple loop: for f in $(git grep -l "include <linux/mm.h>") ; do sed -i -e '/include <asm\/pgtable.h>/ d' $f done Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-1-rppt@kernel.org Link: http://lkml.kernel.org/r/20200514170327.31389-2-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | xen/privcmd: Remove unneeded asm/tlb.h includeThomas Gleixner2020-04-261-1/+0
| |/ |/| | | | | | | | | | | | | Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200421092600.236617960@linutronix.de
* | Merge tag 'for-linus-5.7-rc2-tag' of ↵Linus Torvalds2020-04-171-1/+8
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen update from Juergen Gross: - a small cleanup patch - a security fix for a bug in the Xen hypervisor to avoid enabling Xen guests to crash dom0 on an unfixed hypervisor. * tag 'for-linus-5.7-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: arm/xen: make _xen_start_info static xen/xenbus: ensure xenbus_map_ring_valloc() returns proper grant status
| * xen/xenbus: ensure xenbus_map_ring_valloc() returns proper grant statusJuergen Gross2020-04-141-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | xenbus_map_ring_valloc() maps a ring page and returns the status of the used grant (0 meaning success). There are Xen hypervisors which might return the value 1 for the status of a failed grant mapping due to a bug. Some callers of xenbus_map_ring_valloc() test for errors by testing the returned status to be less than zero, resulting in no error detected and crashing later due to a not available ring page. Set the return value of xenbus_map_ring_valloc() to GNTST_general_error in case the grant status reported by Xen is greater than zero. This is part of XSA-316. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Wei Liu <wl@xen.org> Link: https://lore.kernel.org/r/20200326080358.1018-1-jgross@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
* | Merge tag 'for-linus-5.7-rc1b-tag' of ↵Linus Torvalds2020-04-1012-102/+113
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull more xen updates from Juergen Gross: - two cleanups - fix a boot regression introduced in this merge window - fix wrong use of memory allocation flags * tag 'for-linus-5.7-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: x86/xen: fix booting 32-bit pv guest x86/xen: make xen_pvmmu_arch_setup() static xen/blkfront: fix memory allocation flags in blkfront_setup_indirect() xen: Use evtchn_type_t as a type for event channels
| * xen: Use evtchn_type_t as a type for event channelsYan Yankovskyi2020-04-0712-102/+113
| | | | | | | | | | | | | | | | | | | | Make event channel functions pass event channel port using evtchn_port_t type. It eliminates signed <-> unsigned conversion. Signed-off-by: Yan Yankovskyi <yyankovskyi@gmail.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20200323152343.GA28422@kbp1-lhp-F74019 Signed-off-by: Juergen Gross <jgross@suse.com>
* | Merge tag 'for-linus-5.7-rc1-tag' of ↵Linus Torvalds2020-04-033-89/+47
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from Juergen Gross: - a cleanup patch removing an unused function - a small fix for the xen pciback driver - a series for making the unwinder hyppay with the Xen PV guest idle task stacks * tag 'for-linus-5.7-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: x86/xen: Make the secondary CPU idle tasks reliable x86/xen: Make the boot CPU idle task reliable xen-pciback: fix INTERRUPT_TYPE_* defines xen/xenbus: remove unused xenbus_map_ring()
| * xen-pciback: fix INTERRUPT_TYPE_* definesMarek Marczykowski-Górecki2020-03-302-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | xen_pcibk_get_interrupt_type() assumes INTERRUPT_TYPE_NONE being 0 (initialize ret to 0 and return as INTERRUPT_TYPE_NONE). Fix the definition to make INTERRUPT_TYPE_NONE really 0, and also shift other values to not leave holes. But also, do not assume INTERRUPT_TYPE_NONE being 0 anymore to avoid similar confusions in the future. Fixes: 476878e4b2be ("xen-pciback: optionally allow interrupt enable flag writes") Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com>
| * xen/xenbus: remove unused xenbus_map_ring()Juergen Gross2020-03-301-84/+42
| | | | | | | | | | | | | | | | | | | | xenbus_map_ring() is used nowhere in the tree, remove it. xenbus_unmap_ring() is used only locally, so make it static and move it up. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com>
* | Merge tag 'smp-core-2020-03-30' of ↵Linus Torvalds2020-03-301-1/+1
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core SMP updates from Thomas Gleixner: "CPU (hotplug) updates: - Support for locked CSD objects in smp_call_function_single_async() which allows to simplify callsites in the scheduler core and MIPS - Treewide consolidation of CPU hotplug functions which ensures the consistency between the sysfs interface and kernel state. The low level functions cpu_up/down() are now confined to the core code and not longer accessible from random code" * tag 'smp-core-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits) cpu/hotplug: Ignore pm_wakeup_pending() for disable_nonboot_cpus() cpu/hotplug: Hide cpu_up/down() cpu/hotplug: Move bringup of secondary CPUs out of smp_init() torture: Replace cpu_up/down() with add/remove_cpu() firmware: psci: Replace cpu_up/down() with add/remove_cpu() xen/cpuhotplug: Replace cpu_up/down() with device_online/offline() parisc: Replace cpu_up/down() with add/remove_cpu() sparc: Replace cpu_up/down() with add/remove_cpu() powerpc: Replace cpu_up/down() with add/remove_cpu() x86/smp: Replace cpu_up/down() with add/remove_cpu() arm64: hibernate: Use bringup_hibernate_cpu() cpu/hotplug: Provide bringup_hibernate_cpu() arm64: Use reboot_cpu instead of hardconding it to 0 arm64: Don't use disable_nonboot_cpus() ARM: Use reboot_cpu instead of hardcoding it to 0 ARM: Don't use disable_nonboot_cpus() ia64: Replace cpu_down() with smp_shutdown_nonboot_cpus() cpu/hotplug: Create a new function to shutdown nonboot cpus cpu/hotplug: Add new {add,remove}_cpu() functions sched/core: Remove rq.hrtick_csd_pending ...
| * xen/cpuhotplug: Replace cpu_up/down() with device_online/offline()Qais Yousef2020-03-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The core device API performs extra housekeeping bits that are missing from directly calling cpu_up/down(). See commit a6717c01ddc2 ("powerpc/rtas: use device model APIs and serialization during LPM") for an example description of what might go wrong. This also prepares to make cpu_up/down() a private interface of the cpu subsystem. Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lkml.kernel.org/r/20200323135110.30522-14-qais.yousef@arm.com
* | xen/xenbus: fix lockingJuergen Gross2020-03-052-7/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 060eabe8fbe726 ("xenbus/backend: Protect xenbus callback with lock") introduced a bug by holding a lock while calling a function which might schedule. Fix that by using a semaphore instead. Fixes: 060eabe8fbe726 ("xenbus/backend: Protect xenbus callback with lock") Signed-off-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20200305100323.16736-1-jgross@suse.com Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
* | xenbus: req->err should be updated before req->stateDongli Zhang2020-03-051-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the barrier to guarantee that req->err is always updated before req->state. Otherwise, read_reply() would not return ERR_PTR(req->err) but req->body, when process_writes()->xb_write() is failed. Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Link: https://lore.kernel.org/r/20200303221423.21962-2-dongli.zhang@oracle.com Reviewed-by: Julien Grall <jgrall@amazon.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
* | xenbus: req->body should be updated before req->stateDongli Zhang2020-03-052-3/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The req->body should be updated before req->state is updated and the order should be guaranteed by a barrier. Otherwise, read_reply() might return req->body = NULL. Below is sample callstack when the issue is reproduced on purpose by reordering the updates of req->body and req->state and adding delay in code between updates of req->state and req->body. [ 22.356105] general protection fault: 0000 [#1] SMP PTI [ 22.361185] CPU: 2 PID: 52 Comm: xenwatch Not tainted 5.5.0xen+ #6 [ 22.366727] Hardware name: Xen HVM domU, BIOS ... [ 22.372245] RIP: 0010:_parse_integer_fixup_radix+0x6/0x60 ... ... [ 22.392163] RSP: 0018:ffffb2d64023fdf0 EFLAGS: 00010246 [ 22.395933] RAX: 0000000000000000 RBX: 75746e7562755f6d RCX: 0000000000000000 [ 22.400871] RDX: 0000000000000000 RSI: ffffb2d64023fdfc RDI: 75746e7562755f6d [ 22.405874] RBP: 0000000000000000 R08: 00000000000001e8 R09: 0000000000cdcdcd [ 22.410945] R10: ffffb2d6402ffe00 R11: ffff9d95395eaeb0 R12: ffff9d9535935000 [ 22.417613] R13: ffff9d9526d4a000 R14: ffff9d9526f4f340 R15: ffff9d9537654000 [ 22.423726] FS: 0000000000000000(0000) GS:ffff9d953bc80000(0000) knlGS:0000000000000000 [ 22.429898] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 22.434342] CR2: 000000c4206a9000 CR3: 00000001ea3fc002 CR4: 00000000001606e0 [ 22.439645] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 22.444941] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 22.450342] Call Trace: [ 22.452509] simple_strtoull+0x27/0x70 [ 22.455572] xenbus_transaction_start+0x31/0x50 [ 22.459104] netback_changed+0x76c/0xcc1 [xen_netfront] [ 22.463279] ? find_watch+0x40/0x40 [ 22.466156] xenwatch_thread+0xb4/0x150 [ 22.469309] ? wait_woken+0x80/0x80 [ 22.472198] kthread+0x10e/0x130 [ 22.474925] ? kthread_park+0x80/0x80 [ 22.477946] ret_from_fork+0x35/0x40 [ 22.480968] Modules linked in: xen_kbdfront xen_fbfront(+) xen_netfront xen_blkfront [ 22.486783] ---[ end trace a9222030a747c3f7 ]--- [ 22.490424] RIP: 0010:_parse_integer_fixup_radix+0x6/0x60 The virt_rmb() is added in the 'true' path of test_reply(). The "while" is changed to "do while" so that test_reply() is used as a read memory barrier. Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Link: https://lore.kernel.org/r/20200303221423.21962-1-dongli.zhang@oracle.com Reviewed-by: Julien Grall <jgrall@amazon.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
* | xen: Replace zero-length array with flexible-array memberGustavo A. R. Silva2020-03-051-1/+1
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Link: https://lore.kernel.org/r/20200226212612.GA4663@embeddedor Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
* Merge tag 'for-linus-5.6-rc3-tag' of ↵Linus Torvalds2020-02-211-1/+3
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "Two small fixes for Xen: - a fix to avoid warnings with new gcc - a fix for incorrectly disabled interrupts when calling _cond_resched()" * tag 'for-linus-5.6-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: Enable interrupts when calling _cond_resched() x86/xen: Distribute switch variables for initialization
| * xen: Enable interrupts when calling _cond_resched()Thomas Gleixner2020-02-201-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | xen_maybe_preempt_hcall() is called from the exception entry point xen_do_hypervisor_callback with interrupts disabled. _cond_resched() evades the might_sleep() check in cond_resched() which would have caught that and schedule_debug() unfortunately lacks a check for irqs_disabled(). Enable interrupts around the call and use cond_resched() to catch future issues. Fixes: fdfd811ddde3 ("x86/xen: allow privcmd hypercalls to be preempted") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/878skypjrh.fsf@nanos.tec.linutronix.de Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
* | Merge tag 'for-linus-5.6-rc1-tag' of ↵Linus Torvalds2020-02-0510-14/+277
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from Juergen Gross: - fix a bug introduced in 5.5 in the Xen gntdev driver - fix the Xen balloon driver when running on ancient Xen versions - allow Xen stubdoms to control interrupt enable flags of passed-through PCI cards - release resources in Xen backends under memory pressure * tag 'for-linus-5.6-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/blkback: Consistently insert one empty line between functions xen/blkback: Remove unnecessary static variable name prefixes xen/blkback: Squeeze page pools if a memory pressure is detected xenbus/backend: Protect xenbus callback with lock xenbus/backend: Add memory pressure handler callback xen/gntdev: Do not use mm notifiers with autotranslating guests xen/balloon: Support xend-based toolstack take two xen-pciback: optionally allow interrupt enable flag writes
| * xenbus/backend: Protect xenbus callback with lockSeongJae Park2020-01-292-3/+15
| | | | | | | | | | | | | | | | | | | | | | | | A driver's 'reclaim_memory' callback can race with 'probe' or 'remove' because it will be called whenever memory pressure is detected. To avoid such race, this commit embeds a spinlock in each 'xenbus_device' and make 'xenbus' to hold the lock while the corresponded callbacks are running. Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: SeongJae Park <sjpark@amazon.de> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
| * xenbus/backend: Add memory pressure handler callbackSeongJae Park2020-01-291-0/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Granting pages consumes backend system memory. In systems configured with insufficient spare memory for those pages, it can cause a memory pressure situation. However, finding the optimal amount of the spare memory is challenging for large systems having dynamic resource utilization patterns. Also, such a static configuration might lack flexibility. To mitigate such problems, this commit adds a memory reclaim callback to 'xenbus_driver'. If a memory pressure is detected, 'xenbus' requests every backend driver to volunarily release its memory. Note that it would be able to improve the callback facility for more sophisticated handlings of general pressures. For example, it would be possible to monitor the memory consumption of each device and issue the release requests to only devices which causing the pressure. Also, the callback could be extended to handle not only memory, but general resources. Nevertheless, this version of the implementation defers such sophisticated goals as a future work. Reviewed-by: Juergen Gross <jgross@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: SeongJae Park <sjpark@amazon.de> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
| * xen/gntdev: Do not use mm notifiers with autotranslating guestsBoris Ostrovsky2020-01-281-12/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit d3eeb1d77c5d ("xen/gntdev: use mmu_interval_notifier_insert") missed a test for use_ptemod when calling mmu_interval_read_begin(). Fix that. Fixes: d3eeb1d77c5d ("xen/gntdev: use mmu_interval_notifier_insert") CC: stable@vger.kernel.org # 5.5 Reported-by: Ilpo Järvinen <ilpo.jarvinen@cs.helsinki.fi> Tested-by: Ilpo Järvinen <ilpo.jarvinen@cs.helsinki.fi> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Acked-by: Juergen Gross <jgross@suse.com>
| * xen/balloon: Support xend-based toolstack take twoJuergen Gross2020-01-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 3aa6c19d2f38be ("xen/balloon: Support xend-based toolstack") tried to fix a regression with running on rather ancient Xen versions. Unfortunately the fix was based on the assumption that xend would just use another Xenstore node, but in reality only some downstream versions of xend are doing that. The upstream xend does not write that Xenstore node at all, so the problem must be fixed in another way. The easiest way to achieve that is to fall back to the behavior before commit 96edd61dcf4436 ("xen/balloon: don't online new memory initially") in case the static memory maximum can't be read. This is achieved by setting static_max to the current number of memory pages known by the system resulting in target_diff becoming zero. Fixes: 3aa6c19d2f38be ("xen/balloon: Support xend-based toolstack") Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: <stable@vger.kernel.org> # 4.13 Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
| * xen-pciback: optionally allow interrupt enable flag writesMarek Marczykowski-Górecki2020-01-156-0/+219
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | QEMU running in a stubdom needs to be able to set INTX_DISABLE, and the MSI(-X) enable flags in the PCI config space. This adds an attribute 'allow_interrupt_control' which when set for a PCI device allows writes to this flag(s). The toolstack will need to set this for stubdoms. When enabled, guest (stubdomain) will be allowed to set relevant enable flags, but only one at a time - i.e. it refuses to enable more than one of INTx, MSI, MSI-X at a time. This functionality is needed only for config space access done by device model (stubdomain) serving a HVM with the actual PCI device. It is not necessary and unsafe to enable direct access to those bits for PV domain with the device attached. For PV domains, there are separate protocol messages (XEN_PCI_OP_{enable,disable}_{msi,msix}) for this purpose. Those ops in addition to setting enable bits, also configure MSI(-X) in dom0 kernel - which is undesirable for PCI passthrough to HVM guests. This should not introduce any new security issues since a malicious guest (or stubdom) can already generate MSIs through other ways, see [1] page 8. Additionally, when qemu runs in dom0, it already have direct access to those bits. This is the second iteration of this feature. First was proposed as a direct Xen interface through a new hypercall, but ultimately it was rejected by the maintainer, because of mixing pciback and hypercalls for PCI config space access isn't a good design. Full discussion at [2]. [1]: https://invisiblethingslab.com/resources/2011/Software%20Attacks%20on%20Intel%20VT-d.pdf [2]: https://xen.markmail.org/thread/smpgpws4umdzizze [part of the commit message and sysfs handling] Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> [the rest] Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> [boris: A few small changes suggested by Roger, some formatting changes] Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
* | Merge tag 'drm-next-2020-01-30' of git://anongit.freedesktop.org/drm/drmLinus Torvalds2020-01-301-23/+0
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull drm updates from Davbe Airlie: "This is the main pull request for graphics for 5.6. Usual selection of changes all over. I've got one outstanding vmwgfx pull that touches mm so kept it separate until after all of this lands. I'll try and get it to you soon after this, but it might be early next week (nothing wrong with code, just my schedule is messy) This also hits a lot of fbdev drivers with some cleanups. Other notables: - vulkan timeline semaphore support added to syncobjs - nouveau turing secureboot/graphics support - Displayport MST display stream compression support Detailed summary: uapi: - dma-buf heaps added (and fixed) - command line add support for panel oreientation - command line allow overriding penguin count drm: - mipi dsi definition updates - lockdep annotations for dma_resv - remove dma-buf kmap/kunmap support - constify fb_ops in all fbdev drivers - MST fix for daisy chained hotplug- - CTA-861-G modes with VIC >= 193 added - fix drm_panel_of_backlight export - LVDS decoder support - more device based logging support - scanline alighment for dumb buffers - MST DSC helpers scheduler: - documentation fixes - job distribution improvements panel: - Logic PD type 28 panel support - Jimax8729d MIPI-DSI - igenic JZ4770 - generic DSI devicetree bindings - sony acx424AKP panel - Leadtek LTK500HD1829 - xinpeng XPP055C272 - AUO B116XAK01 - GiantPlus GPM940B0 - BOE NV140FHM-N49 - Satoz SAT050AT40H12R2 - Sharp LS020B1DD01D panels. ttm: - use blocking WW lock i915: - hw/uapi state separation - Lock annotation improvements - selftest improvements - ICL/TGL DSI VDSC support - VBT parsing improvments - Display refactoring - DSI updates + fixes - HDCP 2.2 for CFL - CML PCI ID fixes - GLK+ fbc fix - PSR fixes - GEN/GT refactor improvments - DP MST fixes - switch context id alloc to xarray - workaround updates - LMEM debugfs support - tiled monitor fixes - ICL+ clock gating programming removed - DP MST disable sequence fixed - LMEM discontiguous object maps - prefaulting for discontiguous objects - use LMEM for dumb buffers if possible - add LMEM mmap support amdgpu: - enable sync object timelines for vulkan - MST atomic routines - enable MST DSC support - add DMCUB display microengine support - DC OEM i2c support - Renoir DC fixes - Initial HDCP 2.x support - BACO support for Arcturus - Use BACO for runtime PM power save - gfxoff on navi10 - gfx10 golden updates and fixes - DCN support on POWER - GFXOFF for raven1 refresh - MM engine idle handlers cleanup - 10bpc EDP panel fixes - renoir watermark fixes - SR-IOV fixes - Arcturus VCN fixes - GDDR6 training fixes - freesync fixes - Pollock support amdkfd: - unify more codepath with amdgpu - use KIQ to setup HIQ rather than MMIO radeon: - fix vma fault handler race - PPC DMA fix - register check fixes for r100/r200 nouveau: - mmap_sem vs dma_resv fix - rewrite the ACR secure boot code for Turing - TU10x graphics engine support (TU11x pending) - Page kind mapping for turing - 10-bit LUT support - GP10B Tegra fixes - HD audio regression fix hisilicon/hibmc: - use generic fbdev code and helpers rockchip: - dsi/px30 support virtio: - fb damage support - static some functions vc4: - use dma_resv lock wrappers msm: - use dma_resv lock wrappers - sc7180 display + DSI support - a618 support - UBWC support improvements vmwgfx: - updates + new logging uapi exynos: - enable/disable callback cleanups etnaviv: - use dma_resv lock wrappers atmel-hlcdc: - clock fixes mediatek: - cmdq support - non-smooth cursor fixes - ctm property support sun4i: - suspend support - A64 mipi dsi support rcar-du: - Color management module support - LVDS encoder dual-link support - R8A77980 support analogic: - add support for an6345 ast: - atomic modeset support - primary plane garbage fix arcgpu: - fixes for fourcc handling tegra: - minor fixes and improvments mcde: - vblank support meson: - OSD1 plane AFBC commit gma500: - add pageflip support - reomve global drm_dev komeda: - tweak debugfs output - d32 support - runtime PM suppotr udl: - use generic shmem helpers - cleanup and fixes" * tag 'drm-next-2020-01-30' of git://anongit.freedesktop.org/drm/drm: (1998 commits) drm/nouveau/fb/gp102-: allow module to load even when scrubber binary is missing drm/nouveau/acr: return error when registering LSF if ACR not supported drm/nouveau/disp/gv100-: not all channel types support reporting error codes drm/nouveau/disp/nv50-: prevent oops when no channel method map provided drm/nouveau: support synchronous pushbuf submission drm/nouveau: signal pending fences when channel has been killed drm/nouveau: reject attempts to submit to dead channels drm/nouveau: zero vma pointer even if we only unreference it rather than free drm/nouveau: Add HD-audio component notifier support drm/nouveau: fix build error without CONFIG_IOMMU_API drm/nouveau/kms/nv04: remove set but not used variable 'width' drm/nouveau/kms/nv50: remove set but not unused variable 'nv_connector' drm/nouveau/mmu: fix comptag memory leak drm/nouveau/gr/gp10b: Use gp100_grctx and gp100_gr_zbc drm/nouveau/pmu/gm20b,gp10b: Fix Falcon bootstrapping drm/exynos: Rename Exynos to lowercase drm/exynos: change callback names drm/mst: Don't do atomic checks over disabled managers drm/amdgpu: add the lost mutex_init back drm/amd/display: skip opp blank or unblank if test pattern enabled ...
| * | Backmerge v5.5-rc7 into drm-nextDave Airlie2020-01-205-33/+33
| |\| | | | | | | | | | | | | | | | msm needs 5.5-rc4, go to the latest. Signed-off-by: Dave Airlie <airlied@redhat.com>
| * | Merge tag 'drm-misc-next-2019-12-16' of ↵Daniel Vetter2019-12-171-23/+0
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v5.6: UAPI Changes: - Add support for DMA-BUF HEAPS. Cross-subsystem Changes: - mipi dsi definition updates, pulled into drm-intel as well. - Add lockdep annotations for dma_resv vs mmap_sem and fs_reclaim. - Remove support for dma-buf kmap/kunmap. - Constify fb_ops in all fbdev drivers, including drm drivers and drm-core, and media as well. Core Changes: - Small cleanups to ttm. - Fix SCDC definition. - Assorted cleanups to core. - Add todo to remove load/unload hooks, and use generic fbdev emulation. - Assorted documentation updates. - Use blocking ww lock in ttm fault handler. - Remove drm_fb_helper_fbdev_setup/teardown. - Warning fixes with W=1 for atomic. - Use drm_debug_enabled() instead of drm_debug flag testing in various drivers. - Fallback to nontiled mode in fbdev emulation when not all tiles are present. (Later on reverted) - Various kconfig indentation fixes in core and drivers. - Fix freeing transactions in dp-mst correctly. - Sean Paul is steping down as core maintainer. :-( - Add lockdep annotations for atomic locks vs dma-resv. - Prevent use-after-free for a bad job in drm_scheduler. - Fill out all block sizes in the P01x and P210 definitions. - Avoid division by zero in drm/rect, and fix bounds. - Add drm/rect selftests. - Add aspect ratio and alternate clocks for HDMI 4k modes. - Add todo for drm_framebuffer_funcs and fb_create cleanup. - Drop DRM_AUTH for prime import/export ioctls. - Clear DP-MST payload id tables downstream when initializating. - Fix for DSC throughput definition. - Add extra FEC definitions. - Fix fake offset in drm_gem_object_funs.mmap. - Stop using encoder->bridge in core directly - Handle bridge chaining slightly better. - Add backlight support to drm/panel, and use it in many panel drivers. - Increase max number of y420 modes from 128 to 256, as preparation to add the new modes. Driver Changes: - Small fixes all over. - Fix documentation in vkms. - Fix mmap_sem vs dma_resv in nouveau. - Small cleanup in komeda. - Add page flip support in gma500 for psb/cdv. - Add ddc symlink in the connector sysfs directory for many drivers. - Add support for analogic an6345, and fix small bugs in it. - Add atomic modesetting support to ast. - Fix radeon fault handler VMA race. - Switch udl to use generic shmem helpers. - Unconditional vblank handling for mcde. - Miscellaneous fixes to mcde. - Tweak debug output from komeda using debugfs. - Add gamma and color transform support to komeda for DOU-IPS. - Add support for sony acx424AKP panel. - Various small cleanups to gma500. - Use generic fbdev emulation in udl, and replace udl_framebuffer with generic implementation. - Add support for Logic PD Type 28 panel. - Use drm_panel_* wrapper functions in exynos/tegra/msm. - Add devicetree bindings for generic DSI panels. - Don't include drm_pci.h directly in many drivers. - Add support for begin/end_cpu_access in udmabuf. - Stop using drm_get_pci_dev in gma500 and mga200. - Fixes to UDL damage handling, and use dma_buf_begin/end_cpu_access. - Add devfreq thermal support to panfrost. - Fix hotplug with daisy chained monitors by removing VCPI when disabling topology manager. - meson: Add support for OSD1 plane AFBC commit. - Stop displaying garbage when toggling ast primary plane on/off. - More cleanups and fixes to UDL. - Add D32 suport to komeda. - Remove globle copy of drm_dev in gma500. - Add support for Boe Himax8279d MIPI-DSI LCD panel. - Add support for ingenic JZ4770 panel. - Small null pointer deference fix in ingenic. - Remove support for the special tfp420 driver, as there is a generic way to do it. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ba73535a-9334-5302-2e1f-5208bd7390bd@linux.intel.com
| | * | xen/gntdev-dmabuf: Ditch dummy map functionsDaniel Vetter2019-11-251-23/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's no in-kernel users for the k(un)map stuff. And the mmap one is actively harmful - return 0 and then _not_ actually mmaping can't end well. Acked-by: Sumit Semwal <sumit.semwal@linaro.org> Reviewed-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: xen-devel@lists.xenproject.org Link: https://patchwork.freedesktop.org/patch/msgid/20191118103536.17675-14-daniel.vetter@ffwll.ch
* | | | Merge tag 'v5.5-rc3' into sched/core, to pick up fixesIngo Molnar2019-12-2510-96/+79
|\ \ \ \ | | |_|/ | |/| | | | | | Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | Merge tag 'for-linus-5.5b-rc3-tag' of ↵Linus Torvalds2019-12-215-33/+33
| |\ \ \ | | |/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "This contains two cleanup patches and a small series for supporting reloading the Xen block backend driver" * tag 'for-linus-5.5b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/grant-table: remove multiple BUG_ON on gnttab_interface xen-blkback: support dynamic unbind/bind xen/interface: re-define FRONT/BACK_RING_ATTACH() xenbus: limit when state is forced to closed xenbus: move xenbus_dev_shutdown() into frontend code... xen/blkfront: Adjust indentation in xlvbd_alloc_gendisk
| | * | xen/grant-table: remove multiple BUG_ON on gnttab_interfaceAditya Pakki2019-12-201-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | gnttab_request_version() always sets the gnttab_interface variable and the assertions to check for empty gnttab_interface is unnecessary. The patch eliminates multiple such assertions. Signed-off-by: Aditya Pakki <pakki001@umn.edu> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
| | * | xenbus: limit when state is forced to closedPaul Durrant2019-12-201-2/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a driver probe() fails then leave the xenstore state alone. There is no reason to modify it as the failure may be due to transient resource allocation issues and hence a subsequent probe() may succeed. If the driver supports re-binding then only force state to closed during remove() only in the case when the toolstack may need to clean up. This can be detected by checking whether the state in xenstore has been set to closing prior to device removal. NOTE: Re-bind support is indicated by new boolean in struct xenbus_driver, which defaults to false. Subsequent patches will add support to some backend drivers. Signed-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
| | * | xenbus: move xenbus_dev_shutdown() into frontend code...Paul Durrant2019-12-204-27/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ...and make it static xenbus_dev_shutdown() is seemingly intended to cause clean shutdown of PV frontends when a guest is rebooted. Indeed the function waits for a conpletion which is only set by a call to xenbus_frontend_closed(). This patch removes the shutdown() method from backends and moves xenbus_dev_shutdown() from xenbus_probe.c into xenbus_probe_frontend.c, renaming it appropriately and making it static. NOTE: In the case where the backend is running in a driver domain, the toolstack should have already terminated any frontends that may be using it (since Xen does not support re-startable PV driver domains) so xenbus_dev_shutdown() should never be called. Signed-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>