summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* tree-wide: use reinit_completion instead of INIT_COMPLETIONWolfram Sang2013-11-1586-138/+138
| | | | | | | | | | | | Use this new function to make code more comprehensible, since we are reinitialzing the completion, not initializing. [akpm@linux-foundation.org: linux-next resyncs] Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Acked-by: Linus Walleij <linus.walleij@linaro.org> (personally at LCE13) Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* sched: replace INIT_COMPLETION with reinit_completionWolfram Sang2013-11-151-3/+15
| | | | | | | | | | | | | | | | | For the casual device driver writer, it is hard to remember when to use init_completion (to init a completion structure) or INIT_COMPLETION (to *reinit* a completion structure). Furthermore, while all other completion functions exepct a pointer as a parameter, INIT_COMPLETION does not. To make it easier to remember which function to use and to make code more readable, introduce a new inline function with the proper name and consistent argument type. Update the kernel-doc for init_completion while we are here. Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Acked-by: Linus Walleij <linus.walleij@linaro.org> (personally at LCE13) Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* drivers/rtc/rtc-hid-sensor-time.c: enable HID input processing earlyAlexander Holler2013-11-151-0/+7
| | | | | | | | | | | Enable the processing of HID input records before the RTC will be registered, in order to allow the RTC register function to read clock. Without doing that the clock can only be read after the probe function has finished. Signed-off-by: Alexander Holler <holler@ahsoftware.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* drivers/rtc/rtc-hid-sensor-time.c: use dev_get_platdata()Jingoo Han2013-11-151-2/+2
| | | | | | | | | | Use the wrapper function for retrieving the platform data instead of accessing dev->platform_data directly. This is a cosmetic change to make the code simpler and enhance the readability. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* vsprintf: ignore %n againKees Cook2013-11-151-11/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | This ignores %n in printf again, as was originally documented. Implementing %n poses a greater security risk than utility, so it should stay ignored. To help anyone attempting to use %n, a warning will be emitted if it is encountered. Based on an earlier patch by Joe Perches. Because %n was designed to write to pointers on the stack, it has been frequently used as an attack vector when bugs are found that leak user-controlled strings into functions that ultimately process format strings. While this class of bug can still be turned into an information leak, removing %n eliminates the common method of elevating such a bug into an arbitrary kernel memory writing primitive, significantly reducing the danger of this class of bug. For seq_file users that need to know the length of a written string for padding, please see seq_setwidth() and seq_pad() instead. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Joe Perches <joe@perches.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* seq_file: remove "%n" usage from seq_file usersTetsuo Handa2013-11-1510-97/+73
| | | | | | | | | | | | All seq_printf() users are using "%n" for calculating padding size, convert them to use seq_setwidth() / seq_pad() pair. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Joe Perches <joe@perches.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* seq_file: introduce seq_setwidth() and seq_pad()Tetsuo Handa2013-11-152-0/+30
| | | | | | | | | | | | | | | | There are several users who want to know bytes written by seq_*() for alignment purpose. Currently they are using %n format for knowing it because seq_*() returns 0 on success. This patch introduces seq_setwidth() and seq_pad() for allowing them to align without using %n format. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Joe Perches <joe@perches.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* lockref: use BLOATED_SPINLOCKS to avoid explicit config dependenciesPeter Zijlstra2013-11-153-9/+7
| | | | | | | | | | | | Avoid the fragile Kconfig construct guestimating spinlock_t sizes; use a friendly compile-time test to determine this. [kirill.shutemov@linux.intel.com: drop CONFIG_CMPXCHG_LOCKREF] Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: create a separate slab for page->ptl allocationKirill A. Shutemov2013-11-153-1/+17
| | | | | | | | | | | | | | | If DEBUG_SPINLOCK and DEBUG_LOCK_ALLOC are enabled spinlock_t on x86_64 is 72 bytes. For page->ptl they will be allocated from kmalloc-96 slab, so we loose 24 on each. An average system can easily allocate few tens thousands of page->ptl and overhead is significant. Let's create a separate slab for page->ptl allocation to solve this. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: properly separate the bloated ptl from the regular casePeter Zijlstra2013-11-154-21/+25
| | | | | | | | | | | | Use kernel/bounds.c to convert build-time spinlock_t size check into a preprocessor symbol and apply that to properly separate the page::ptl situation. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: dynamically allocate page->ptl if it cannot be embedded to struct pageKirill A. Shutemov2013-11-156-24/+179
| | | | | | | | | | | | | | | | | | | | | | If split page table lock is in use, we embed the lock into struct page of table's page. We have to disable split lock, if spinlock_t is too big be to be embedded, like when DEBUG_SPINLOCK or DEBUG_LOCK_ALLOC enabled. This patch add support for dynamic allocation of split page table lock if we can't embed it to struct page. page->ptl is unsigned long now and we use it as spinlock_t if sizeof(spinlock_t) <= sizeof(long), otherwise it's pointer to spinlock_t. The spinlock_t allocated in pgtable_page_ctor() for PTE table and in pgtable_pmd_page_ctor() for PMD table. All other helpers converted to support dynamically allocated page->ptl. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* xtensa: use buddy allocator for PTE tableKirill A. Shutemov2013-11-153-30/+13
| | | | | | | | | | | | | At the moment xtensa uses slab allocator for PTE table. It doesn't work with enabled split page table lock: slab uses page->slab_cache and page->first_page for its pages. These fields share stroage with page->ptl. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Chris Zankel <chris@zankel.net> Acked-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* iommu/arm-smmu: handle pgtable_page_ctor() failKirill A. Shutemov2013-11-151-1/+4
| | | | | | | | | Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Grant Likely <grant.likely@linaro.org> Cc: Rob Herring <rob.herring@calxeda.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* xtensa: handle pgtable_page_ctor() failKirill A. Shutemov2013-11-151-1/+4
| | | | | | | | Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Chris Zankel <chris@zankel.net> Cc: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* x86: handle pgtable_page_ctor() failKirill A. Shutemov2013-11-151-2/+6
| | | | | | | | | Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* unicore32: handle pgtable_page_ctor() failKirill A. Shutemov2013-11-151-6/+8
| | | | | | | Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* um: handle pgtable_page_ctor() failKirill A. Shutemov2013-11-151-2/+6
| | | | | | | | Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* tile: handle pgtable_page_ctor() failKirill A. Shutemov2013-11-151-1/+5
| | | | | | | Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* sparc: handle pgtable_page_ctor() failKirill A. Shutemov2013-11-152-6/+10
| | | | | | | Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* sh: handle pgtable_page_ctor() failKirill A. Shutemov2013-11-151-1/+4
| | | | | | | Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* score: handle pgtable_page_ctor() failKirill A. Shutemov2013-11-151-3/+6
| | | | | | | | Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Chen Liqin <liqin.chen@sunplusct.com> Acked-by: Lennox Wu <lennox.wu@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* s390: handle pgtable_page_ctor() failKirill A. Shutemov2013-11-151-2/+9
| | | | | | | | Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* powerpc: handle pgtable_page_ctor() failKirill A. Shutemov2013-11-153-5/+12
| | | | | | | | Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* parisc: handle pgtable_page_ctor() failKirill A. Shutemov2013-11-151-2/+6
| | | | | | | | Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mips: handle pgtable_page_ctor() failKirill A. Shutemov2013-11-151-3/+6
| | | | | | | Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* metag: handle pgtable_page_ctor() failKirill A. Shutemov2013-11-151-2/+6
| | | | | | | Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: James Hogan <james.hogan@imgtec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* m68k: handle pgtable_page_ctor() failKirill A. Shutemov2013-11-153-3/+14
| | | | | | | Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* m32r: handle pgtable_page_ctor() failKirill A. Shutemov2013-11-151-1/+4
| | | | | | | Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ia64: handle pgtable_page_ctor() failKirill A. Shutemov2013-11-151-1/+4
| | | | | | | | Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* hexagon: handle pgtable_page_ctor() failKirill A. Shutemov2013-11-151-4/+6
| | | | | | | Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Richard Kuo <rkuo@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* frv: handle pgtable_page_ctor() failKirill A. Shutemov2013-11-151-4/+8
| | | | | | | Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* cris: handle pgtable_page_ctor() failKirill A. Shutemov2013-11-151-1/+4
| | | | | | | | Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Mikael Starvik <starvik@axis.com> Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* avr32: handle pgtable_page_ctor() failKirill A. Shutemov2013-11-151-1/+4
| | | | | | | | Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* arm64: handle pgtable_page_ctor() failKirill A. Shutemov2013-11-151-3/+6
| | | | | | | | Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* arm: handle pgtable_page_ctor() failKirill A. Shutemov2013-11-151-5/+7
| | | | | | | Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Russell King <linux@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* arc: handle pgtable_page_ctor() failKirill A. Shutemov2013-11-151-3/+8
| | | | | | | Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Vineet Gupta <vgupta@synopsys.com> [for arch/arc bits] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* alpha: handle pgtable_page_ctor() failKirill A. Shutemov2013-11-151-1/+4
| | | | | | | | | Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* openrisc: add missing pgtable_page_ctor/dtor callsKirill A. Shutemov2013-11-151-2/+8
| | | | | | | | | | It will fix NR_PAGETABLE accounting. It's also required if the arch is going ever support split ptl. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Jonas Bonn <jonas@southpole.se> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mn10300: add missing pgtable_page_ctor/dtor callsKirill A. Shutemov2013-11-152-2/+8
| | | | | | | | | | | It will fix NR_PAGETABLE accounting. It's also required if the arch is going ever support split ptl. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: David Howells <dhowells@redhat.com> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* microblaze: add missing pgtable_page_ctor/dtor callsKirill A. Shutemov2013-11-151-3/+9
| | | | | | | | | | It will fix NR_PAGETABLE accounting. It's also required if the arch is going ever support split ptl. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Michal Simek <monstr@monstr.eu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: allow pgtable_page_ctor() to failKirill A. Shutemov2013-11-151-1/+2
| | | | | | | | | | | Change pgtable_page_ctor() return type from void to bool. Returns true, if initialization is successful and false otherwise. Current implementation never fails, but it will change later. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* xtensa: fix potential NULL-pointer dereferenceKirill A. Shutemov2013-11-151-1/+5
| | | | | | | | | | Add missing check for memory allocation fail. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Chris Zankel <chris@zankel.net> Cc: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* m32r: fix potential NULL-pointer dereferenceKirill A. Shutemov2013-11-151-0/+2
| | | | | | | | | Add missing check for memory allocation fail. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* cris: fix potential NULL-pointer dereferenceKirill A. Shutemov2013-11-151-0/+2
| | | | | | | | | | Add missing check for memory allocation fail. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Mikael Starvik <starvik@axis.com> Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* x86: add missed pgtable_pmd_page_ctor/dtor calls for preallocated pmdsKirill A. Shutemov2013-11-151-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In split page table lock case, we embed spinlock_t into struct page. For obvious reason, we don't want to increase size of struct page if spinlock_t is too big, like with DEBUG_SPINLOCK or DEBUG_LOCK_ALLOC or on -rt kernel. So we disable split page table lock, if spinlock_t is too big. This patchset allows to allocate the lock dynamically if spinlock_t is big. In this page->ptl is used to store pointer to spinlock instead of spinlock itself. It costs additional cache line for indirect access, but fix page fault scalability for multi-threaded applications. LOCK_STAT depends on DEBUG_SPINLOCK, so on current kernel enabling LOCK_STAT to analyse scalability issues breaks scalability. ;) The patchset mostly fixes this. Results for ./thp_memscale -c 80 -b 512M on 4-socket machine: baseline, no CONFIG_LOCK_STAT: 9.115460703 seconds time elapsed baseline, CONFIG_LOCK_STAT=y: 53.890567123 seconds time elapsed patched, no CONFIG_LOCK_STAT: 8.852250368 seconds time elapsed patched, CONFIG_LOCK_STAT=y: 11.069770759 seconds time elapsed Patch count is scary, but most of them trivial. Overview: Patches 1-4 Few bug fixes. No dependencies to other patches. Probably should applied as soon as possible. Patch 5 Changes signature of pgtable_page_ctor(). We will use it for dynamic lock allocation, so it can fail. Patches 6-8 Add missing constructor/destructor calls on few archs. It's fixes NR_PAGETABLE accounting and prepare to use split ptl. Patches 9-33 Add pgtable_page_ctor() fail handling to all archs. Patches 34 Finally adds support of dynamically-allocated page->pte. Also contains documentation for split page table lock. This patch (of 34): I've missed that we preallocate few pmds on pgd_alloc() if X86_PAE enabled. Let's add missed constructor/destructor calls. I haven't noticed it during testing since prep_new_page() clears page->mapping and therefore page->ptl. It's effectively equal to spin_lock_init(&page->ptl). Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chen Liqin <liqin.chen@sunplusct.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Chris Zankel <chris@zankel.net> Cc: Christoph Lameter <cl@linux.com> Cc: David Howells <dhowells@redhat.com> Cc: David S. Miller <davem@davemloft.net> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Grant Likely <grant.likely@linaro.org> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: James Hogan <james.hogan@imgtec.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Lennox Wu <lennox.wu@gmail.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Mikael Starvik <starvik@axis.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rob Herring <rob.herring@calxeda.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* x86, mm: enable split page table lock for PMD levelKirill A. Shutemov2013-11-152-1/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Enable PMD split page table lock for X86_64 and PAE. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Tested-by: Alex Thorlton <athorlton@sgi.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Jones <davej@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kees Cook <keescook@chromium.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Robin Holt <robinmholt@gmail.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Hugh Dickins <hughd@google.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: implement split page table lock for PMD levelKirill A. Shutemov2013-11-154-3/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The basic idea is the same as with PTE level: the lock is embedded into struct page of table's page. We can't use mm->pmd_huge_pte to store pgtables for THP, since we don't take mm->page_table_lock anymore. Let's reuse page->lru of table's page for that. pgtable_pmd_page_ctor() returns true, if initialization is successful and false otherwise. Current implementation never fails, but assumption that constructor can fail will help to port it to -rt where spinlock_t is rather huge and cannot be embedded into struct page -- dynamic allocation is required. Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Tested-by: Alex Thorlton <athorlton@sgi.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Jones <davej@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kees Cook <keescook@chromium.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Robin Holt <robinmholt@gmail.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Hugh Dickins <hughd@google.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: convert the rest to new page table lock apiKirill A. Shutemov2013-11-154-59/+73
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Only trivial cases left. Let's convert them altogether. Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Tested-by: Alex Thorlton <athorlton@sgi.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Jones <davej@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kees Cook <keescook@chromium.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Robin Holt <robinmholt@gmail.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm, hugetlb: convert hugetlbfs to use split pmd lockKirill A. Shutemov2013-11-157-54/+105
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hugetlb supports multiple page sizes. We use split lock only for PMD level, but not for PUD. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Tested-by: Alex Thorlton <athorlton@sgi.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Jones <davej@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kees Cook <keescook@chromium.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Robin Holt <robinmholt@gmail.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm, thp: do not access mm->pmd_huge_pte directlyKirill A. Shutemov2013-11-154-18/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently mm->pmd_huge_pte protected by page table lock. It will not work with split lock. We have to have per-pmd pmd_huge_pte for proper access serialization. For now, let's just introduce wrapper to access mm->pmd_huge_pte. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Tested-by: Alex Thorlton <athorlton@sgi.com> Cc: Alex Thorlton <athorlton@sgi.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Jones <davej@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kees Cook <keescook@chromium.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Robin Holt <robinmholt@gmail.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>