summaryrefslogtreecommitdiffstats
path: root/arch/x86/kernel/cpu/mtrr/main.c
Commit message (Collapse)AuthorAgeFilesLines
* x86/mtrr: Rename main.c to mtrr.c and remove duplicate prefixesJoe Perches2018-05-131-887/+0
| | | | | | | | | | | | | | Kbuild uses the first file as the name for KBUILD_MODNAME. mtrr uses main.c as its first file, so rename that file to mtrr.c and fixup the Makefile. Remove the now duplicate "mtrr: " prefixes from the logging calls. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lkml.kernel.org/r/ae1fa81a0d1fad87571967b91ea90f70f486e853.1525964384.git.joe@perches.com
* x86/cpu: Rename cpu_data.x86_mask to cpu_data.x86_steppingJia Zhang2018-02-151-2/+2
| | | | | | | | | | | | | | | | | x86_mask is a confusing name which is hard to associate with the processor's stepping. Additionally, correct an indent issue in lib/cpu.c. Signed-off-by: Jia Zhang <qianyue.zj@alibaba-inc.com> [ Updated it to more recent kernels. ] Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: bp@alien8.de Cc: tony.luck@intel.com Link: http://lkml.kernel.org/r/1514771530-70829-1-git-send-email-qianyue.zj@alibaba-inc.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* x86/mtrr: Prevent CPU hotplug lock recursionThomas Gleixner2017-08-151-3/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Larry reported a CPU hotplug lock recursion in the MTRR code. ============================================ WARNING: possible recursive locking detected systemd-udevd/153 is trying to acquire lock: (cpu_hotplug_lock.rw_sem){.+.+.+}, at: [<c030fc26>] stop_machine+0x16/0x30 but task is already holding lock: (cpu_hotplug_lock.rw_sem){.+.+.+}, at: [<c0234353>] mtrr_add_page+0x83/0x470 .... cpus_read_lock+0x48/0x90 stop_machine+0x16/0x30 mtrr_add_page+0x18b/0x470 mtrr_add+0x3e/0x70 mtrr_add_page() holds the hotplug rwsem already and calls stop_machine() which acquires it again. Call stop_machine_cpuslocked() instead. Reported-and-tested-by: Larry Finger <Larry.Finger@lwfinger.net> Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1708140920250.1865@nanos Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Borislav Petkov <bp@suse.de>
* x86/mtrr: Remove get_online_cpus() from mtrr_save_state()Sebastian Andrzej Siewior2017-05-261-2/+0
| | | | | | | | | | | | | | | | | | | | mtrr_save_state() is invoked from native_cpu_up() which is in the context of a CPU hotplug operation and therefor calling get_online_cpus() is pointless. While this works in the current get_online_cpus() implementation it prevents from converting the hotplug locking to percpu rwsems. Remove it. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20170524081547.651378834@linutronix.de
* x86/boot/e820: Move asm/e820.h to asm/e820/api.hIngo Molnar2017-01-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In line with asm/e820/types.h, move the e820 API declarations to asm/e820/api.h and update all usage sites. This is just a mechanical, obviously correct move & replace patch, there will be subsequent changes to clean up the code and to make better use of the new header organization. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
* x86: Apply more __ro_after_init and constKees Cook2016-08-101-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Guided by grsecurity's analogous __read_only markings in arch/x86, this applies several uses of __ro_after_init to structures that are only updated during __init, and const for some structures that are never updated. Additionally extends __init markings to some functions that are only used during __init, and cleans up some missing C99 style static initializers. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brad Spengler <spender@grsecurity.net> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Brown <david.brown@linaro.org> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Emese Revfy <re.emese@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathias Krause <minipli@googlemail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: PaX Team <pageexec@freemail.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kernel-hardening@lists.openwall.com Link: http://lkml.kernel.org/r/20160808232906.GA29731@www.outflux.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
* x86: Audit and remove any remaining unnecessary uses of module.hPaul Gortmaker2016-07-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Historically a lot of these existed because we did not have a distinction between what was modular code and what was providing support to modules via EXPORT_SYMBOL and friends. That changed when we forked out support for the latter into the export.h file. This means we should be able to reduce the usage of module.h in code that is obj-y Makefile or bool Kconfig. In the case of some of these which are modular, we can extend that to also include files that are building basic support functionality but not related to loading or registering the final module; such files also have no need whatsoever for module.h The advantage in removing such instances is that module.h itself sources about 15 other headers; adding significantly to what we feed cpp, and it can obscure what headers we are effectively using. Since module.h was the source for init.h (for __init) and for export.h (for EXPORT_SYMBOL) we consider each instance for the presence of either and replace as needed. In the case of crypto/glue_helper.c we delete a redundant instance of MODULE_LICENSE in order to delete module.h -- the license info is already present at the top of the file. The uncore change warrants a mention too; it is uncore.c that uses module.h and not uncore.h; hence the relocation done there. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20160714001901.31603-9-paul.gortmaker@windriver.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* x86/mtrr: Fix PAT init handling when MTRR is disabledToshi Kani2016-03-291-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | get_mtrr_state() calls pat_init() on BSP even if MTRR is disabled. This results in calling pat_init() on BSP only since APs do not call pat_init() when MTRR is disabled. This inconsistency between BSP and APs leads to undefined behavior. Make BSP's calling condition to pat_init() consistent with AP's, mtrr_ap_init() and mtrr_aps_init(). Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Toshi Kani <toshi.kani@hp.com> Cc: elliott@hpe.com Cc: konrad.wilk@oracle.com Cc: paul.gortmaker@windriver.com Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/1458769323-24491-6-git-send-email-toshi.kani@hpe.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* x86/mtrr: Fix Xorg crashes in Qemu sessionsToshi Kani2016-03-291-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A Xorg failure on qemu32 was reported as a regression [1] caused by commit 9cd25aac1f44 ("x86/mm/pat: Emulate PAT when it is disabled"). This patch fixes the Xorg crash. Negative effects of this regression were the following two failures [2] in Xorg on QEMU with QEMU CPU model "qemu32" (-cpu qemu32), which were triggered by the fact that its virtual CPU does not support MTRRs. #1. copy_process() failed in the check in reserve_pfn_range() copy_process copy_mm dup_mm dup_mmap copy_page_range track_pfn_copy reserve_pfn_range A WC map request was tracked as WC in memtype, which set a PTE as UC (pgprot) per __cachemode2pte_tbl[]. This led to this error in reserve_pfn_range() called from track_pfn_copy(), which obtained a pgprot from a PTE. It converts pgprot to page_cache_mode, which does not necessarily result in the original page_cache_mode since __cachemode2pte_tbl[] redirects multiple types to UC. #2. error path in copy_process() then hit WARN_ON_ONCE in untrack_pfn(). x86/PAT: Xorg:509 map pfn expected mapping type uncached- minus for [mem 0xfd000000-0xfdffffff], got write-combining Call Trace: dump_stack warn_slowpath_common ? untrack_pfn ? untrack_pfn warn_slowpath_null untrack_pfn ? __kunmap_atomic unmap_single_vma ? pagevec_move_tail_fn unmap_vmas exit_mmap mmput copy_process.part.47 _do_fork SyS_clone do_syscall_32_irqs_on entry_INT80_32 These negative effects are caused by two separate bugs, but they can be addressed in separate patches. Fixing the pat_init() issue described below addresses the root cause, and avoids Xorg to hit these cases. When the CPU does not support MTRRs, MTRR does not call pat_init(), which leaves PAT enabled without initializing PAT. This pat_init() issue is a long-standing issue, but manifested as issue #1 (and then hit issue #2) with the above-mentioned commit because the memtype now tracks cache attribute with 'page_cache_mode'. This pat_init() issue existed before the commit, but we used pgprot in memtype. Hence, we did not have issue #1 before. But WC request resulted in WT in effect because WC pgrot is actually WT when PAT is not initialized. This is not how it was designed to work. When PAT is set to disable properly, WC is converted to UC. The use of WT can result in a system crash if the target range does not support WT. Fortunately, nobody ran into such issue before. To fix this pat_init() issue, PAT code has been enhanced to provide pat_disable() interface. Call this interface when MTRRs are disabled. By setting PAT to disable properly, PAT bypasses the memtype check, and avoids issue #1. [1]: https://lkml.org/lkml/2016/3/3/828 [2]: https://lkml.org/lkml/2016/3/4/775 Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Toshi Kani <toshi.kani@hp.com> Cc: elliott@hpe.com Cc: konrad.wilk@oracle.com Cc: paul.gortmaker@windriver.com Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/1458769323-24491-5-git-send-email-toshi.kani@hpe.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* Merge branch 'x86-asm-for-linus' of ↵Linus Torvalds2016-03-151-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm updates from Ingo Molnar: "This is another big update. Main changes are: - lots of x86 system call (and other traps/exceptions) entry code enhancements. In particular the complex parts of the 64-bit entry code have been migrated to C code as well, and a number of dusty corners have been refreshed. (Andy Lutomirski) - vDSO special mapping robustification and general cleanups (Andy Lutomirski) - cpufeature refactoring, cleanups and speedups (Borislav Petkov) - lots of other changes ..." * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (64 commits) x86/cpufeature: Enable new AVX-512 features x86/entry/traps: Show unhandled signal for i386 in do_trap() x86/entry: Call enter_from_user_mode() with IRQs off x86/entry/32: Change INT80 to be an interrupt gate x86/entry: Improve system call entry comments x86/entry: Remove TIF_SINGLESTEP entry work x86/entry/32: Add and check a stack canary for the SYSENTER stack x86/entry/32: Simplify and fix up the SYSENTER stack #DB/NMI fixup x86/entry: Only allocate space for tss_struct::SYSENTER_stack if needed x86/entry: Vastly simplify SYSENTER TF (single-step) handling x86/entry/traps: Clear DR6 early in do_debug() and improve the comment x86/entry/traps: Clear TIF_BLOCKSTEP on all debug exceptions x86/entry/32: Restore FLAGS on SYSEXIT x86/entry/32: Filter NT and speed up AC filtering in SYSENTER x86/entry/compat: In SYSENTER, sink AC clearing below the existing FLAGS test selftests/x86: In syscall_nt, test NT|TF as well x86/asm-offsets: Remove PARAVIRT_enabled x86/entry/32: Introduce and use X86_BUG_ESPFIX instead of paravirt_enabled uprobes: __create_xol_area() must nullify xol_mapping.fault x86/cpufeature: Create a new synthetic cpu capability for machine check recovery ...
| * x86/cpufeature: Carve out X86_FEATURE_*Borislav Petkov2016-01-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move them to a separate header and have the following dependency: x86/cpufeatures.h <- x86/processor.h <- x86/cpufeature.h This makes it easier to use the header in asm code and not include the whole cpufeature.h and add guards for asm. Suggested-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1453842730-28463-5-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | x86/cpu: Convert printk(KERN_<LEVEL> ...) to pr_<level>(...)Chen Yucong2016-02-031-10/+10
|/ | | | | | | | | | | | | | | | - Use the more current logging style pr_<level>(...) instead of the old printk(KERN_<LEVEL> ...). - Convert pr_warning() to pr_warn(). Signed-off-by: Chen Yucong <slaoub@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1454384702-21707-1-git-send-email-slaoub@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* x86/cpufeature: Remove unused and seldomly used cpu_has_xx macrosBorislav Petkov2015-12-191-1/+1
| | | | | | | | | | | | | | | | | | Those are stupid and code should use static_cpu_has_safe() or boot_cpu_has() instead. Kill the least used and unused ones. The remaining ones need more careful inspection before a conversion can happen. On the TODO. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1449481182-27541-4-git-send-email-bp@alien8.de Cc: David Sterba <dsterba@suse.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Matt Mackall <mpm@selenic.com> Cc: Chris Mason <clm@fb.com> Cc: Josef Bacik <jbacik@fb.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* x86/mm/mtrr: Remove kernel internal MTRR interfaces: unexport mtrr_add() and ↵Luis R. Rodriguez2015-08-281-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mtrr_del() The effort to replace mtrr_add() with architecture agnostic arch_phys_wc_add() is complete, this will ensure write-combining implementations (PAT on x86) is taken advantage instead of using MTRR. With the effort done now, hide direct MTRR access for drivers. The legacy user-space /proc/mtrr ABI is not affected. Update x86 documentation on MTRR to reflect the completion of the phasing out of direct access to MTRR, also add a note on platform firmware code use of MTRRs based on the obituary discussion of MTRRs on Linux [0]. [0] http://lkml.kernel.org/r/1438991330.3109.196.camel@hp.com Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> Cc: <syrjala@sci.fi> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Walls <awalls@md.metrocast.net> Cc: Antonino Daplas <adaplas@gmail.com> Cc: Borislav Petkov <bp@suse.de> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dave Airlie <airlied@redhat.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: Doug Ledford <dledford@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suresh Siddha <sbsiddha@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com> Cc: Toshi Kani <toshi.kani@hp.com> Cc: Ville Syrjälä <syrjala@sci.fi> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: airlied@linux.ie Cc: benh@kernel.crashing.org Cc: bhelgaas@google.com Cc: dan.j.williams@intel.com Cc: konrad.wilk@oracle.com Cc: linux-fbdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-media@vger.kernel.org Cc: mst@redhat.com Cc: netdev@vger.kernel.org Cc: vinod.koul@intel.com Cc: xen-devel@lists.xensource.com Link: http://lkml.kernel.org/r/1440443613-13696-12-git-send-email-mcgrof@do-not-panic.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* x86/mm/pat: Wrap pat_enabled into a function APILuis R. Rodriguez2015-05-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We use pat_enabled in x86-specific code to see if PAT is enabled or not but we're granting full access to it even though readers do not need to set it. If, for instance, we granted access to it to modules later they then could override the variable setting... no bueno. This renames pat_enabled to a new static variable __pat_enabled. Folks are redirected to use pat_enabled() now. Code that sets this can only be internal to pat.c. Apart from the early kernel parameter "nopat" to disable PAT, we also have a few cases that disable it later and make use of a helper pat_disable(). It is wrapped under an ifdef but since that code cannot run unless PAT was enabled its not required to wrap it with ifdefs, unwrap that. Likewise, since "nopat" doesn't really change non-PAT systems just remove that ifdef as well. Although we could add and use an early_param_off(), these helpers don't use __read_mostly but we want to keep __read_mostly for __pat_enabled as this is a hot path -- upon boot, for instance, a simple guest may see ~4k accesses to pat_enabled(). Since __read_mostly early boot params are not that common we don't add a helper for them just yet. Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Walls <awalls@md.metrocast.net> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Christoph Lameter <cl@linux.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dave Airlie <airlied@redhat.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Doug Ledford <dledford@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kyle McMartin <kyle@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1430425520-22275-3-git-send-email-mcgrof@do-not-panic.com Link: http://lkml.kernel.org/r/1432628901-18044-13-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
* x86/mm/mtrr: Generalize runtime disabling of MTRRsLuis R. Rodriguez2015-05-271-8/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is possible to enable CONFIG_MTRR and CONFIG_X86_PAT and end up with a system with MTRR functionality disabled but PAT functionality enabled. This can happen, for instance, when the Xen hypervisor is used where MTRRs are not supported but PAT is. This can happen on Linux as of commit 47591df50512 ("xen: Support Xen pv-domains using PAT") by Juergen, introduced in v3.19. Technically, we should assume the proper CPU bits would be set to disable MTRRs but we can't always rely on this. At least on the Xen Hypervisor, for instance, only X86_FEATURE_MTRR was disabled as of Xen 4.4 through Xen commit 586ab6a [0], but not X86_FEATURE_K6_MTRR, X86_FEATURE_CENTAUR_MCR, or X86_FEATURE_CYRIX_ARR for instance. Roger Pau Monné has clarified though that although this is technically true we will never support PVH on these CPU types so Xen has no need to disable these bits on those systems. As per Roger, AMD K6, Centaur and VIA chips don't have the necessary hardware extensions to allow running PVH guests [1]. As per Toshi it is also possible for the BIOS to disable MTRR support, in such cases get_mtrr_state() would update the MTRR state as per the BIOS, we need to propagate this information as well. x86 MTRR code relies on quite a bit of checks for mtrr_if being set to check to see if MTRRs did get set up. Instead, lets provide a generic getter for that. This also adds a few checks where they were not before which could potentially safeguard ourselves against incorrect usage of MTRR where this was not desirable. Where possible match error codes as if MTRRs were disabled on arch/x86/include/asm/mtrr.h. Lastly, since disabling MTRRs can happen at run time and we could end up with PAT enabled, best record now in our logs when MTRRs are disabled. [0] ~/devel/xen (git::stable-4.5)$ git describe --contains 586ab6a 4.4.0-rc1~18 [1] http://lists.xenproject.org/archives/html/xen-devel/2015-03/msg03460.html Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Antonino Daplas <adaplas@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dave Airlie <airlied@redhat.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Roger Pau Monné <roger.pau@citrix.com> Cc: Stefan Bader <stefan.bader@canonical.com> Cc: Suresh Siddha <sbsiddha@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com> Cc: Toshi Kani <toshi.kani@hp.com> Cc: Ville Syrjälä <syrjala@sci.fi> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: bhelgaas@google.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: konrad.wilk@oracle.com Cc: venkatesh.pallipadi@intel.com Cc: ville.syrjala@linux.intel.com Cc: xen-devel@lists.xensource.com Link: http://lkml.kernel.org/r/1426893517-2511-3-git-send-email-mcgrof@do-not-panic.com Link: http://lkml.kernel.org/r/1432628901-18044-12-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
* x86/mm/mtrr: Avoid #ifdeffery with phys_wc_to_mtrr_index()Luis R. Rodriguez2015-05-271-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is only one user but since we're going to bury MTRR next out of access to drivers, expose this last piece of API to drivers in a general fashion only needing io.h for access to helpers. Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Abhilash Kesavan <a.kesavan@samsung.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Antonino Daplas <adaplas@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Cristian Stoica <cristian.stoica@freescale.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dave Airlie <airlied@redhat.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suresh Siddha <sbsiddha@gmail.com> Cc: Thierry Reding <treding@nvidia.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com> Cc: Toshi Kani <toshi.kani@hp.com> Cc: Ville Syrjälä <syrjala@sci.fi> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will.deacon@arm.com> Cc: dri-devel@lists.freedesktop.org Link: http://lkml.kernel.org/r/1429722736-4473-1-git-send-email-mcgrof@do-not-panic.com Link: http://lkml.kernel.org/r/1432628901-18044-11-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
* x86/mm/mtrr, pat: Document Write Combining MTRR type effects on PAT / ↵Luis R. Rodriguez2015-05-271-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | non-PAT pages As part of the effort to phase out MTRR use document write-combining MTRR effects on pages with different non-PAT page attributes flags and different PAT entry values. Extend arch_phys_wc_add() documentation to clarify power of two sizes / boundary requirements as we phase out mtrr_add() use. Lastly hint towards ioremap_uc() for corner cases on device drivers working with devices with mixed regions where MTRR size requirements would otherwise not enable write-combining effective memory types. Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Antonino Daplas <adaplas@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dave Airlie <airlied@redhat.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suresh Siddha <sbsiddha@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com> Cc: Ville Syrjälä <syrjala@sci.fi> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: linux-fbdev@vger.kernel.org Link: http://lkml.kernel.org/r/1430343851-967-3-git-send-email-mcgrof@do-not-panic.com Link: http://lkml.kernel.org/r/1432628901-18044-10-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
* x86: Add more disabled featuresDave Hansen2014-09-111-3/+3
| | | | | | | | | | | | | | | | | | | The original motivation for these patches was for an Intel CPU feature called MPX. The patch to add a disabled feature for it will go in with the other parts of the support. But, in the meantime, there are a few other features than MPX that we can make assumptions about at compile-time based on compile options. Add them to disabled-features.h and check them with cpu_feature_enabled(). Note that this gets rid of the last things that needed an #ifdef CONFIG_X86_64 in cpufeature.h. Yay! Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: http://lkml.kernel.org/r/20140911211524.C0EC332A@viggo.jf.intel.com Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds2013-07-091-0/+71
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull drm updates from Dave Airlie: "Okay this is the big one, I was stalled on the fbdev pull req as I stupidly let fbdev guys merge a patch I required to fix a warning with some patches I had, they ended up merging the patch from the wrong place, but the warning should be fixed. In future I'll just take the patch myself! Outside drm: There are some snd changes for the HDMI audio interactions on haswell, they've been acked for inclusion via my tree. This relies on the wound/wait tree from Ingo which is already merged. Major changes: AMD finally released the dynamic power management code for all their GPUs from r600->present day, this is great, off by default for now but also a huge amount of code, in fact it is most of this pull request. Since it landed there has been a lot of community testing and Alex has sent a lot of fixes for any bugs found so far. I suspect radeon might now be the biggest kernel driver ever :-P p.s. radeon.dpm=1 to enable dynamic powermanagement for anyone. New drivers: Renesas r-car display unit. Other highlights: - core: GEM CMA prime support, use new w/w mutexs for TTM reservations, cursor hotspot, doc updates - dvo chips: chrontel 7010B support - i915: Haswell (fbc, ips, vecs, watermarks, audio powerwell), Valleyview (enabled by default, rc6), lots of pll reworking, 30bpp support (this time for sure) - nouveau: async buffer object deletion, context/register init updates, kernel vp2 engine support, GF117 support, GK110 accel support (with external nvidia ucode), context cleanups. - exynos: memory leak fixes, Add S3C64XX SoC series support, device tree updates, common clock framework support, - qxl: cursor hotspot support, multi-monitor support, suspend/resume support - mgag200: hw cursor support, g200 mode limiting - shmobile: prime support - tegra: fixes mostly I've been banging on this quite a lot due to the size of it, and it seems to okay on everything I've tested it on." * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (811 commits) drm/radeon/dpm: implement vblank_too_short callback for si drm/radeon/dpm: implement vblank_too_short callback for cayman drm/radeon/dpm: implement vblank_too_short callback for btc drm/radeon/dpm: implement vblank_too_short callback for evergreen drm/radeon/dpm: implement vblank_too_short callback for 7xx drm/radeon/dpm: add checks against vblank time drm/radeon/dpm: add helper to calculate vblank time drm/radeon: remove stray line in old pm code drm/radeon/dpm: fix display_gap programming on rv7xx drm/nvc0/gr: fix gpc firmware regression drm/nouveau: fix minor thinko causing bo moves to not be async on kepler drm/radeon/dpm: implement force performance level for TN drm/radeon/dpm: implement force performance level for ON/LN drm/radeon/dpm: implement force performance level for SI drm/radeon/dpm: implement force performance level for cayman drm/radeon/dpm: implement force performance levels for 7xx/eg/btc drm/radeon/dpm: add infrastructure to force performance levels drm/radeon: fix surface setup on r1xx drm/radeon: add support for 3d perf states on older asics drm/radeon: set default clocks for SI when DPM is disabled ...
| * Add arch_phys_wc_{add, del} to manipulate WC MTRRs if neededAndy Lutomirski2013-05-311-0/+71
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Several drivers currently use mtrr_add through various #ifdef guards and/or drm wrappers. The vast majority of them want to add WC MTRRs on x86 systems and don't actually need the MTRR if PAT (i.e. ioremap_wc, etc) are working. arch_phys_wc_add and arch_phys_wc_del are new functions, available on all architectures and configurations, that add WC MTRRs on x86 if needed (and handle errors) and do nothing at all otherwise. They're also easier to use than mtrr_add and mtrr_del, so the call sites can be simplified. As an added benefit, this will avoid wasting MTRRs and possibly warning pointlessly on PAT-supporting systems. Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Dave Airlie <airlied@redhat.com>
* | x86: Fix /proc/mtrr with base/size more than 44bitsYinghai Lu2013-06-251-7/+9
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On one sytem that mtrr range is more then 44bits, in dmesg we have [ 0.000000] MTRR default type: write-back [ 0.000000] MTRR fixed ranges enabled: [ 0.000000] 00000-9FFFF write-back [ 0.000000] A0000-BFFFF uncachable [ 0.000000] C0000-DFFFF write-through [ 0.000000] E0000-FFFFF write-protect [ 0.000000] MTRR variable ranges enabled: [ 0.000000] 0 [000080000000-0000FFFFFFFF] mask 3FFF80000000 uncachable [ 0.000000] 1 [380000000000-38FFFFFFFFFF] mask 3F0000000000 uncachable [ 0.000000] 2 [000099000000-000099FFFFFF] mask 3FFFFF000000 write-through [ 0.000000] 3 [00009A000000-00009AFFFFFF] mask 3FFFFF000000 write-through [ 0.000000] 4 [381FFA000000-381FFBFFFFFF] mask 3FFFFE000000 write-through [ 0.000000] 5 [381FFC000000-381FFC0FFFFF] mask 3FFFFFF00000 write-through [ 0.000000] 6 [0000AD000000-0000ADFFFFFF] mask 3FFFFF000000 write-through [ 0.000000] 7 [0000BD000000-0000BDFFFFFF] mask 3FFFFF000000 write-through [ 0.000000] 8 disabled [ 0.000000] 9 disabled but /proc/mtrr report wrong: reg00: base=0x080000000 ( 2048MB), size= 2048MB, count=1: uncachable reg01: base=0x80000000000 (8388608MB), size=1048576MB, count=1: uncachable reg02: base=0x099000000 ( 2448MB), size= 16MB, count=1: write-through reg03: base=0x09a000000 ( 2464MB), size= 16MB, count=1: write-through reg04: base=0x81ffa000000 (8519584MB), size= 32MB, count=1: write-through reg05: base=0x81ffc000000 (8519616MB), size= 1MB, count=1: write-through reg06: base=0x0ad000000 ( 2768MB), size= 16MB, count=1: write-through reg07: base=0x0bd000000 ( 3024MB), size= 16MB, count=1: write-through reg08: base=0x09b000000 ( 2480MB), size= 16MB, count=1: write-combining so bit 44 and bit 45 get cut off. We have problems in arch/x86/kernel/cpu/mtrr/generic.c::generic_get_mtrr(). 1. for base, we miss cast base_lo to 64bit before shifting. Fix that by adding u64 casting. 2. for size, it only can handle 44 bits aka 32bits + page_shift Fix that with 64bit mask instead of 32bit mask_lo, then range could be more than 44bits. At the same time, we need to update size_or_mask for old cpus that does support cpuid 0x80000008 to get phys_addr. Need to set high 32bits to all 1s, otherwise will not get correct size for them. Also fix mtrr_add_page: it should check base and (base + size - 1) instead of base and size, as base and size could be small but base + size could bigger enough to be out of boundary. We can use boot_cpu_data.x86_phys_bits directly to avoid size_or_mask. So When are we going to have size more than 44bits? that is 16TiB. after patch we have right ouput: reg00: base=0x080000000 ( 2048MB), size= 2048MB, count=1: uncachable reg01: base=0x380000000000 (58720256MB), size=1048576MB, count=1: uncachable reg02: base=0x099000000 ( 2448MB), size= 16MB, count=1: write-through reg03: base=0x09a000000 ( 2464MB), size= 16MB, count=1: write-through reg04: base=0x381ffa000000 (58851232MB), size= 32MB, count=1: write-through reg05: base=0x381ffc000000 (58851264MB), size= 1MB, count=1: write-through reg06: base=0x0ad000000 ( 2768MB), size= 16MB, count=1: write-through reg07: base=0x0bd000000 ( 3024MB), size= 16MB, count=1: write-through reg08: base=0x09b000000 ( 2480MB), size= 16MB, count=1: write-combining -v2: simply checking in mtrr_add_page according to hpa. [ hpa: This probably wants to go into -stable only after having sat in mainline for a bit. It is not a regression. ] Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1371162815-29931-1-git-send-email-yinghai@kernel.org Cc: <stable@vger.kernel.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* Merge branch 'for-linus' of ↵Linus Torvalds2012-12-131-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial branch from Jiri Kosina: "Usual stuff -- comment/printk typo fixes, documentation updates, dead code elimination." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits) HOWTO: fix double words typo x86 mtrr: fix comment typo in mtrr_bp_init propagate name change to comments in kernel source doc: Update the name of profiling based on sysfs treewide: Fix typos in various drivers treewide: Fix typos in various Kconfig wireless: mwifiex: Fix typo in wireless/mwifiex driver messages: i2o: Fix typo in messages/i2o scripts/kernel-doc: check that non-void fcts describe their return value Kernel-doc: Convention: Use a "Return" section to describe return values radeon: Fix typo and copy/paste error in comments doc: Remove unnecessary declarations from Documentation/accounting/getdelays.c various: Fix spelling of "asynchronous" in comments. Fix misspellings of "whether" in comments. eisa: Fix spelling of "asynchronous". various: Fix spelling of "registered" in comments. doc: fix quite a few typos within Documentation target: iscsi: fix comment typos in target/iscsi drivers treewide: fix typo of "suport" in various comments and Kconfig treewide: fix typo of "suppport" in various comments ...
| * x86 mtrr: fix comment typo in mtrr_bp_initOlaf Hering2012-12-071-1/+1
| | | | | | | | | | Signed-off-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
* | x86, hotplug: The first online processor saves the MTRR stateFenghua Yu2012-11-141-2/+7
|/ | | | | | | | | Ask the first online CPU to save mtrr instead of asking BSP. BSP could be offline when mtrr_save_state() is called. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/1352835171-3958-12-git-send-email-fenghua.yu@intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* mtrr: fix UP breakage caused during switch to stop_machineTejun Heo2011-08-251-2/+0
| | | | | | | | | | | | | | | While removing custom rendezvous code and switching to stop_machine, commit 192d8857427d ("x86, mtrr: use stop_machine APIs for doing MTRR rendezvous") completely dropped mtrr setting code on !CONFIG_SMP breaking MTRR settting on UP. Fix it by removing the incorrect CONFIG_SMP. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Anders Eriksson <aeriksson@fastmail.fm> Tested-and-acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* x86, mtrr: Use pci_dev->revisionSergei Shtylyov2011-07-021-8/+5
| | | | | | | | | | | | | | This code uses PCI_CLASS_REVISION instead of PCI_REVISION_ID, so it wasn't converted by commit 44c10138fd4 ("PCI: Change all drivers to use pci_device->revision") before being moved to arch/x86/... Do it now at last -- and save one level of indentation... Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/201107012242.08347.sshtylyov@ru.mvista.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86, mtrr: use stop_machine APIs for doing MTRR rendezvousSuresh Siddha2011-06-271-151/+41
| | | | | | | | | | | | | | | | | MTRR rendezvous sequence is not implemened using stop_machine() before, as this gets called both from the process context aswell as the cpu online paths (where the cpu has not come online and the interrupts are disabled etc). Now that we have a new stop_machine_from_inactive_cpu() API, use it for rendezvous during mtrr init of a logical processor that is coming online. For the rest (runtime MTRR modification, system boot, resume paths), use stop_machine() to implement the rendezvous sequence. This will consolidate and cleanup the code. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/20110623182057.076997177@sbsiddha-MOBL3.sc.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* x86, mtrr: lock stop machine during MTRR rendezvous sequenceSuresh Siddha2011-06-271-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | MTRR rendezvous sequence using stop_one_cpu_nowait() can potentially happen in parallel with another system wide rendezvous using stop_machine(). This can lead to deadlock (The order in which works are queued can be different on different cpu's. Some cpu's will be running the first rendezvous handler and others will be running the second rendezvous handler. Each set waiting for the other set to join for the system wide rendezvous, leading to a deadlock). MTRR rendezvous sequence is not implemented using stop_machine() as this gets called both from the process context aswell as the cpu online paths (where the cpu has not come online and the interrupts are disabled etc). stop_machine() works with only online cpus. For now, take the stop_machine mutex in the MTRR rendezvous sequence that gets called from an online cpu (here we are in the process context and can potentially sleep while taking the mutex). And the MTRR rendezvous that gets triggered during cpu online doesn't need to take this stop_machine lock (as the stop_machine() already ensures that there is no cpu hotplug going on in parallel by doing get_online_cpus()) TBD: Pursue a cleaner solution of extending the stop_machine() infrastructure to handle the case where the calling cpu is still not online and use this for MTRR rendezvous sequence. fixes: https://bugzilla.novell.com/show_bug.cgi?id=672008 Reported-by: Vadim Kotelnikov <vadimuzzz@inbox.ru> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/20110623182056.807230326@sbsiddha-MOBL3.sc.intel.com Cc: stable@kernel.org # 2.6.35+, backport a week or two after this gets more testing in mainline Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* x86, mtrr, pat: Fix one cpu getting out of sync during resumeSuresh Siddha2011-03-291-5/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On laptops with core i5/i7, there were reports that after resume graphics workloads were performing poorly on a specific AP, while the other cpu's were ok. This was observed on a 32bit kernel specifically. Debug showed that the PAT init was not happening on that AP during resume and hence it contributing to the poor workload performance on that cpu. On this system, resume flow looked like this: 1. BP starts the resume sequence and we reinit BP's MTRR's/PAT early on using mtrr_bp_restore() 2. Resume sequence brings all AP's online 3. Resume sequence now kicks off the MTRR reinit on all the AP's. 4. For some reason, between point 2 and 3, we moved from BP to one of the AP's. My guess is that printk() during resume sequence is contributing to this. We don't see similar behavior with the 64bit kernel but there is no guarantee that at this point the remaining resume sequence (after AP's bringup) has to happen on BP. 5. set_mtrr() was assuming that we are still on BP and skipped the MTRR/PAT init on that cpu (because of 1 above) 6. But we were on an AP and this led to not reprogramming PAT on this cpu leading to bad performance. Fix this by doing unconditional mtrr_if->set_all() in set_mtrr() during MTRR/PAT init. This might be unnecessary if we are still running on BP. But it is of no harm and will guarantee that after resume, all the cpu's will be in sync with respect to the MTRR/PAT registers. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <1301438292-28370-1-git-send-email-eric@anholt.net> Signed-off-by: Eric Anholt <eric@anholt.net> Tested-by: Keith Packard <keithp@keithp.com> Cc: stable@kernel.org [v2.6.32+] Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* x86: Use syscore_ops instead of sysdev classes and sysdevsRafael J. Wysocki2011-03-231-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | Some subsystems in the x86 tree need to carry out suspend/resume and shutdown operations with one CPU on-line and interrupts disabled and they define sysdev classes and sysdevs or sysdev drivers for this purpose. This leads to unnecessarily complicated code and excessive memory usage, so switch them to using struct syscore_ops objects for this purpose instead. Generally, there are three categories of subsystems that use sysdevs for implementing PM operations: (1) subsystems whose suspend/resume callbacks ignore their arguments entirely (the majority), (2) subsystems whose suspend/resume callbacks use their struct sys_device argument, but don't really need to do that, because they can be implemented differently in an arguably simpler way (io_apic.c), and (3) subsystems whose suspend/resume callbacks use their struct sys_device argument, but the value of that argument is always the same and could be ignored (microcode_core.c). In all of these cases the subsystems in question may be readily converted to using struct syscore_ops objects for power management and shutdown. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu>
* x86, mtrr: Avoid MTRR reprogramming on BP during boot on UP platformsSuresh Siddha2011-02-031-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Markus Kohn ran into a hard hang regression on an acer aspire 1310, when acpi is enabled. git bisect showed the following commit as the bad one that introduced the boot regression. commit d0af9eed5aa91b6b7b5049cae69e5ea956fd85c3 Author: Suresh Siddha <suresh.b.siddha@intel.com> Date: Wed Aug 19 18:05:36 2009 -0700 x86, pat/mtrr: Rendezvous all the cpus for MTRR/PAT init Because of the UP configuration of that platform, native_smp_prepare_cpus() bailed out (in smp_sanity_check()) before doing the set_mtrr_aps_delayed_init() Further down the boot path, native_smp_cpus_done() will call the delayed MTRR initialization for the AP's (mtrr_aps_init()) with mtrr_aps_delayed_init not set. This resulted in the boot processor reprogramming its MTRR's to the values seen during the start of the OS boot. While this is not needed ideally, this shouldn't have caused any side-effects. This is because the reprogramming of MTRR's (set_mtrr_state() that gets called via set_mtrr()) will check if the live register contents are different from what is being asked to write and will do the actual write only if they are different. BP's mtrr state is read during the start of the OS boot and typically nothing would have changed when we ask to reprogram it on BP again because of the above scenario on an UP platform. So on a normal UP platform no reprogramming of BP MTRR MSR's happens and all is well. However, on this platform, bios seems to be modifying the fixed mtrr range registers between the start of OS boot and when we double check the live registers for reprogramming BP MTRR registers. And as the live registers are modified, we end up reprogramming the MTRR's to the state seen during the start of the OS boot. During ACPI initialization, something in the bios (probably smi handler?) don't like this fact and results in a hard lockup. We didn't see this boot hang issue on this platform before the commit d0af9eed5aa91b6b7b5049cae69e5ea956fd85c3, because only the AP's (if any) will program its MTRR's to the value that BP had at the start of the OS boot. Fix this issue by checking mtrr_aps_delayed_init before continuing further in the mtrr_aps_init(). Now, only AP's (if any) will program its MTRR's to the BP values during boot. Addresses https://bugzilla.novell.com/show_bug.cgi?id=623393 [ By the way, this behavior of the bios modifying MTRR's after the start of the OS boot is not common and the kernel is not prepared to handle this situation well. Irrespective of this issue, during suspend/resume, linux kernel will try to reprogram the BP's MTRR values to the values seen during the start of the OS boot. So suspend/resume might be already broken on this platform for all linux kernel versions. ] Reported-and-bisected-by: Markus Kohn <jabber@gmx.org> Tested-by: Markus Kohn <jabber@gmx.org> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Thomas Renninger <trenn@novell.com> Cc: Rafael Wysocki <rjw@novell.com> Cc: Venkatesh Pallipadi <venki@google.com> Cc: stable@kernel.org # [v2.6.32+] LKML-Reference: <1296694975.4418.402.camel@sbsiddha-MOBL3.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86, mtrr: Use stop machine context to rendezvous all the cpu'sSuresh Siddha2010-07-301-13/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the stop machine context rather than IPI's to rendezvous all the cpus for MTRR initialization that happens during cpu bringup or for MTRR modifications during runtime. This avoids deadlock scenario (reported by Prarit) like: cpu A holds a read_lock (tasklist_lock for example) with irqs enabled cpu B waits for the same lock with irqs disabled using write_lock_irq cpu C doing set_mtrr() (during AP bringup for example), which will try to rendezvous all the cpus using IPI's This will result in C and A come to the rendezvous point and waiting for B. B is stuck forever waiting for the lock and thus not reaching the rendezvous point. Using stop cpu (run in the process context of per cpu based keventd) to do this rendezvous, avoids this deadlock scenario. Also make sure all the cpu's are in the rendezvous handler before we proceed with the local_irq_save() on each cpu. This lock step disabling irqs on all the cpus will avoid other deadlock scenarios (for example involving with the blocking smp_call_function's etc). [ This problem is very old. Marking -stable only for 2.6.35 as the stop_one_cpu_nowait() API is present only in 2.6.35. Any older kernel interested in this fix need to do some more work in backporting this patch. ] Reported-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <1280515602.2682.10.camel@sbsiddha-MOBL3.sc.intel.com> Acked-by: Prarit Bhargava <prarit@redhat.com> Cc: stable@kernel.org [2.6.35] Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* x86: fix mtrr missing kernel-docRandy Dunlap2010-03-051-0/+1
| | | | | | | | | Fix missing kernel-doc notation in mtrr/main.c: Warning(arch/x86/kernel/cpu/mtrr/main.c:152): No description found for parameter 'info' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* x86, mtrr: Constify struct mtrr_opsEmese Revfy2010-02-011-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | This is part of the ops structure constification effort started by Arjan van de Ven et al. Benefits of this constification: * prevents modification of data that is shared (referenced) by many other structure instances at runtime * detects/prevents accidental (but not intentional) modification attempts on archs that enforce read-only kernel data at runtime * potentially better optimized code as the compiler can assume that the const data cannot be changed * the compiler/linker move const data into .rodata and therefore exclude them from false sharing Signed-off-by: Emese Revfy <re.emese@gmail.com> LKML-Reference: <4B65D712.3080804@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* x86, mtrr: make mtrr_aps_delayed_init static boolH. Peter Anvin2009-08-211-3/+3
| | | | | | | | | | mtr_aps_delayed_init was declared u32 and made global, but it only ever takes boolean values and is only ever used in arch/x86/kernel/cpu/mtrr/main.c. Declare it "static bool" and remove external references. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com>
* x86, pat/mtrr: Rendezvous all the cpus for MTRR/PAT initSuresh Siddha2009-08-211-9/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SDM Vol 3a section titled "MTRR considerations in MP systems" specifies the need for synchronizing the logical cpu's while initializing/updating MTRR. Currently Linux kernel does the synchronization of all cpu's only when a single MTRR register is programmed/updated. During an AP online (during boot/cpu-online/resume) where we initialize all the MTRR/PAT registers, we don't follow this synchronization algorithm. This can lead to scenarios where during a dynamic cpu online, that logical cpu is initializing MTRR/PAT with cache disabled (cr0.cd=1) etc while other logical HT sibling continue to run (also with cache disabled because of cr0.cd=1 on its sibling). Starting from Westmere, VMX transitions with cr0.cd=1 don't work properly (because of some VMX performance optimizations) and the above scenario (with one logical cpu doing VMX activity and another logical cpu coming online) can result in system crash. Fix the MTRR initialization by doing rendezvous of all the cpus. During boot and resume, we delay the MTRR/PAT init for APs till all the logical cpu's come online and the rendezvous process at the end of AP's bringup, will initialize the MTRR/PAT for all AP's. For dynamic single cpu online, we synchronize all the logical cpus and do the MTRR/PAT init on the AP that is coming online. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* x86: Clean up mtrr/main.cJaswinder Singh Rajput2009-07-041-213/+242
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix following trivial style problems: ERROR: trailing whitespace X 25 WARNING: Use #include <linux/uaccess.h> instead of <asm/uaccess.h> WARNING: Use #include <linux/kvm_para.h> instead of <asm/kvm_para.h> ERROR: do not initialise externals to 0 or NULL X 2 ERROR: "foo * bar" should be "foo *bar" X 5 ERROR: do not use assignment in if condition X 2 WARNING: line over 80 characters X 8 ERROR: return is not a function, parentheses are not required WARNING: braces {} are not necessary for any arm of this statement ERROR: space required before the open parenthesis '(' X 2 ERROR: open brace '{' following function declarations go on the next line ERROR: space required after that ',' (ctx:VxV) X 8 ERROR: space required before the open parenthesis '(' X 3 ERROR: else should follow close brace '}' WARNING: space prohibited between function name and open parenthesis '(' WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable X 2 Also use pr_debug and pr_warning where possible. total: 50 errors, 14 warnings arch/x86/kernel/cpu/mtrr/main.o: text data bss dec hex filename 3668 116 4156 7940 1f04 main.o.before 3668 116 4156 7940 1f04 main.o.after md5: e01af2fd28deef77c8d01e71acfbd365 main.o.before.asm e01af2fd28deef77c8d01e71acfbd365 main.o.after.asm Suggested-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <20090703164225.GA21447@elte.hu> Cc: Avi Kivity <avi@redhat.com> # Avi, please have a look at the kvm_para.h bit [ More cleanups ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86, mtrr: replace MTRRcap_MSR with msr-index's MSR_MTRRcapJaswinder Singh Rajput2009-05-151-1/+1
| | | | | | | | | Use standard msr-index.h's MSR declaration and no need to declare again. [ Impact: cleanup, no object code change ] Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* x86: MTRR workaround for system with stange var MTRRsYinghai Lu2009-03-171-8/+8
| | | | | | | | | | | | | | | Impact: don't trim e820 according to wrong mtrr Ozan reports that his server emits strange warning. it turns out the BIOS sets the MTRRs incorrectly. Ignore those strange ranges, and don't trim e820, just emit one warning about BIOS Reported-by: Ozan Çağlayan <ozan@pardus.org.tr> Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <49BEE1E7.7020706@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86: separate mtrr cleanup/mtrr_e820 trim to separate fileYinghai Lu2009-03-131-1054/+1
| | | | | | | | | | | Impact: cleanup mtrr main.c is too big, seperate mtrr cleanup and mtrr e820 trim code to another file. Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <49B87C7B.80809@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86: print out mtrr_range_state when user specify sizeYinghai Lu2009-03-131-0/+2
| | | | | | | | | | Impact: print more debug info Keep it consistent with autodetect version. Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <49B87C0A.4010105@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86: tone down mtrr_trim_uncached_memory() warningIngo Molnar2009-01-291-2/+1
| | | | | | | | | | | | | | | | | | | | kerneloops.org is reporting a lot of these warnings that come due to vmware not setting up any MTRRs for emulated CPUs: | Reported 709 times (14696 total reports) | BIOS bug (often in VMWare) where the MTRR's are set up incorrectly | or not at all | | This warning was last seen in version 2.6.29-rc2-git1, and first | seen in 2.6.24. | | More info: | http://www.kerneloops.org/searchweek.php?search=mtrr_trim_uncached_memory Keep a one-liner KERN_INFO about it - so that we have so notice if empty MTRRs are caused by native hardware/BIOS weirdness. Signed-off-by: Ingo Molnar <mingo@elte.hu>
* Merge branch 'linus' into x86/cleanupsIngo Molnar2009-01-021-2/+2
|\ | | | | | | | | Conflicts: arch/x86/kernel/reboot.c
| * x86: Rename mtrr_state struct and macro namesSheng Yang2008-12-311-2/+2
| | | | | | | | | | | | | | Prepare for exporting them. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | Merge branch 'x86/core' into x86/cleanupsIngo Molnar2008-12-271-175/+171
|\|
| * x86: break up mtrr_cleanup() into several small functions.Yinghai Lu2008-10-281-175/+171
| | | | | | | | | | | | | | | | | | Ingo said mtrr_cleanup() is big and ugly. so break it up into more functions and make it more readable. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | x86: remove impossible test in mtrr/main.cRusty Russell2008-12-251-4/+2
|/ | | | | | | | | Impact: cleanup enable_mtrr_cleanup is static, and is never set to anything but 0 or 1. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
*-------. Merge branches 'x86/alternatives', 'x86/cleanups', 'x86/commandline', ↵Ingo Molnar2008-10-061-84/+188
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | 'x86/crashdump', 'x86/debug', 'x86/defconfig', 'x86/doc', 'x86/exports', 'x86/fpu', 'x86/gart', 'x86/idle', 'x86/mm', 'x86/mtrr', 'x86/nmi-watchdog', 'x86/oprofile', 'x86/paravirt', 'x86/reboot', 'x86/sparse-fixes', 'x86/tsc', 'x86/urgent' and 'x86/vmalloc' into x86-v28-for-linus-phase1
| | | | * | x86: mtrr_cleanup: treat WRPROT as UNCACHEABLEYinghai Lu2008-10-041-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For the purpose of MTRR canonicalization, treat WRPROT as UNCACHEABLE. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>