summaryrefslogtreecommitdiffstats
path: root/include/linux/slub_def.h
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'slab/for-linus' of ↵Linus Torvalds2012-03-281-2/+4
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux Pull SLAB changes from Pekka Enberg: "There's the new kmalloc_array() API, minor fixes and performance improvements, but quite honestly, nothing terribly exciting." * 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux: mm: SLAB Out-of-memory diagnostics slab: introduce kmalloc_array() slub: per cpu partial statistics change slub: include include for prefetch slub: Do not hold slub_lock when calling sysfs_slab_add() slub: prefetch next freelist pointer in slab_alloc() slab, cleanup: remove unneeded return
| * slub: per cpu partial statistics changeAlex Shi2012-02-181-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch split the cpu_partial_free into 2 parts: cpu_partial_node, PCP refilling times from node partial; and same name cpu_partial_free, PCP refilling times in slab_free slow path. A new statistic 'cpu_partial_drain' is added to get PCP drain to node partial times. These info are useful when do PCP tunning. The slabinfo.c code is unchanged, since cpu_partial_node is not on slow path. Signed-off-by: Alex Shi <alex.shi@intel.com> Acked-by: Christoph Lameter <cl@linux.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
* | BUG: headers with BUG/BUG_ON etc. need linux/bug.hPaul Gortmaker2012-03-041-0/+1
|/ | | | | | | | | | | | | If a header file is making use of BUG, BUG_ON, BUILD_BUG_ON, or any other BUG variant in a static inline (i.e. not in a #define) then that header really should be including <linux/bug.h> and not just expecting it to be implicitly present. We can make this change risk-free, since if the files using these headers didn't have exposure to linux/bug.h already, they would have been causing compile failures/warnings. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
* slub: correct comments error for per cpu partialAlex Shi2011-09-271-1/+1
| | | | | | | | | Correct comment errors, that mistake cpu partial objects number as pages number, may make reader misunderstand. Signed-off-by: Alex Shi <alex.shi@intel.com> Reviewed-by: Christoph Lameter <cl@linux.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
* slub: per cpu cache for partial pagesChristoph Lameter2011-08-191-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow filling out the rest of the kmem_cache_cpu cacheline with pointers to partial pages. The partial page list is used in slab_free() to avoid per node lock taking. In __slab_alloc() we can then take multiple partial pages off the per node partial list in one go reducing node lock pressure. We can also use the per cpu partial list in slab_alloc() to avoid scanning partial lists for pages with free objects. The main effect of a per cpu partial list is that the per node list_lock is taken for batches of partial pages instead of individual ones. Potential future enhancements: 1. The pickup from the partial list could be perhaps be done without disabling interrupts with some work. The free path already puts the page into the per cpu partial list without disabling interrupts. 2. __slab_free() may have some code paths that could use optimization. Performance: Before After ./hackbench 100 process 200000 Time: 1953.047 1564.614 ./hackbench 100 process 20000 Time: 207.176 156.940 ./hackbench 100 process 20000 Time: 204.468 156.940 ./hackbench 100 process 20000 Time: 204.879 158.772 ./hackbench 10 process 20000 Time: 20.153 15.853 ./hackbench 10 process 20000 Time: 20.153 15.986 ./hackbench 10 process 20000 Time: 19.363 16.111 ./hackbench 1 process 20000 Time: 2.518 2.307 ./hackbench 1 process 20000 Time: 2.258 2.339 ./hackbench 1 process 20000 Time: 2.864 2.163 Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
* Merge branch 'slub/lockless' of ↵Linus Torvalds2011-07-301-0/+3
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6 * 'slub/lockless' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6: (21 commits) slub: When allocating a new slab also prep the first object slub: disable interrupts in cmpxchg_double_slab when falling back to pagelock Avoid duplicate _count variables in page_struct Revert "SLUB: Fix build breakage in linux/mm_types.h" SLUB: Fix build breakage in linux/mm_types.h slub: slabinfo update for cmpxchg handling slub: Not necessary to check for empty slab on load_freelist slub: fast release on full slab slub: Add statistics for the case that the current slab does not match the node slub: Get rid of the another_slab label slub: Avoid disabling interrupts in free slowpath slub: Disable interrupts in free_debug processing slub: Invert locking and avoid slab lock slub: Rework allocator fastpaths slub: Pass kmem_cache struct to lock and freeze slab slub: explicit list_lock taking slub: Add cmpxchg_double_slab() mm: Rearrange struct page slub: Move page->frozen handling near where the page->freelist handling occurs slub: Do not use frozen page flag but a bit in the page counters ...
| * slub: fast release on full slabChristoph Lameter2011-07-021-0/+1
| | | | | | | | | | | | | | | | | | Make deactivation occur implicitly while checking out the current freelist. This avoids one cmpxchg operation on a slab that is now fully in use. Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
| * slub: Add statistics for the case that the current slab does not match the nodeChristoph Lameter2011-07-021-0/+1
| | | | | | | | | | | | | | | | Slub reloads the per cpu slab if the page does not satisfy the NUMA condition. Track those reloads since doing so has a performance impact. Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
| * slub: Add cmpxchg_double_slab()Christoph Lameter2011-07-021-0/+1
| | | | | | | | | | | | | | | | Add a function that operates on the second doubleword in the page struct and manipulates the object counters, the freelist and the frozen attribute. Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
* | slub: Add method to verify memory is not freedBen Greear2011-07-071-0/+13
| | | | | | | | | | | | | | | | This is for tracking down suspect memory usage. Acked-by: Christoph Lameter <cl@linux.com> Signed-off-by: Ben Greear <greearb@candelatech.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
* | slab, slub, slob: Unify alignment definitionChristoph Lameter2011-06-161-10/+0
|/ | | | | | | | | | | | | | Every slab has its on alignment definition in include/linux/sl?b_def.h. Extract those and define a common set in include/linux/slab.h. SLOB: As notes sometimes we need double word alignment on 32 bit. This gives all structures allocated by SLOB a unsigned long long alignment like the others do. SLAB: If ARCH_SLAB_MINALIGN is not set SLAB would set ARCH_SLAB_MINALIGN to zero meaning no alignment at all. Give it the default unsigned long long alignment. Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
* slub: Deal with hyperthetical case of PAGE_SIZE > 2MChristoph Lameter2011-05-211-2/+4
| | | | | | | | | | | kmalloc_index() currently returns -1 if the PAGE_SIZE is larger than 2M which seems to cause some concern since the callers do not check for -1. Insert a BUG() and add a comment to the -1 explaining that the code cannot be reached. Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
* slub: Remove CONFIG_CMPXCHG_LOCAL ifdefferyChristoph Lameter2011-05-071-2/+0
| | | | | | | | | | | | | | | | | | Remove the #ifdefs. This means that the irqsafe_cpu_cmpxchg_double() is used everywhere. There may be performance implications since: A. We now have to manage a transaction ID for all arches B. The interrupt holdoff for arches not supporting CONFIG_CMPXCHG_LOCAL is reduced to a very short irqoff section. There are no multiple irqoff/irqon sequences as a result of this change. Even in the fallback case we only have to do one disable and enable like before. Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
* slub: Add statistics for this_cmpxchg_double failuresChristoph Lameter2011-03-221-0/+1
| | | | | | | Add some statistics for debugging. Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
* Merge branch 'slub/lockless' into for-linusPekka Enberg2011-03-201-2/+5
|\ | | | | | | | | Conflicts: include/linux/slub_def.h
| * Lockless (and preemptless) fastpaths for slubChristoph Lameter2011-03-111-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the this_cpu_cmpxchg_double functionality to implement a lockless allocation algorithm on arches that support fast this_cpu_ops. Each of the per cpu pointers is paired with a transaction id that ensures that updates of the per cpu information can only occur in sequence on a certain cpu. A transaction id is a "long" integer that is comprised of an event number and the cpu number. The event number is incremented for every change to the per cpu state. This means that the cmpxchg instruction can verify for an update that nothing interfered and that we are updating the percpu structure for the processor where we picked up the information and that we are also currently on that processor when we update the information. This results in a significant decrease of the overhead in the fastpaths. It also makes it easy to adopt the fast path for realtime kernels since this is lockless and does not require the use of the current per cpu area over the critical section. It is only important that the per cpu area is current at the beginning of the critical section and at the end. So there is no need even to disable preemption. Test results show that the fastpath cycle count is reduced by up to ~ 40% (alloc/free test goes from ~140 cycles down to ~80). The slowpath for kfree adds a few cycles. Sadly this does nothing for the slowpath which is where the main issues with performance in slub are but the best case performance rises significantly. (For that see the more complex slub patches that require cmpxchg_double) Kmalloc: alloc/free test Before: 10000 times kmalloc(8)/kfree -> 134 cycles 10000 times kmalloc(16)/kfree -> 152 cycles 10000 times kmalloc(32)/kfree -> 144 cycles 10000 times kmalloc(64)/kfree -> 142 cycles 10000 times kmalloc(128)/kfree -> 142 cycles 10000 times kmalloc(256)/kfree -> 132 cycles 10000 times kmalloc(512)/kfree -> 132 cycles 10000 times kmalloc(1024)/kfree -> 135 cycles 10000 times kmalloc(2048)/kfree -> 135 cycles 10000 times kmalloc(4096)/kfree -> 135 cycles 10000 times kmalloc(8192)/kfree -> 144 cycles 10000 times kmalloc(16384)/kfree -> 754 cycles After: 10000 times kmalloc(8)/kfree -> 78 cycles 10000 times kmalloc(16)/kfree -> 78 cycles 10000 times kmalloc(32)/kfree -> 82 cycles 10000 times kmalloc(64)/kfree -> 88 cycles 10000 times kmalloc(128)/kfree -> 79 cycles 10000 times kmalloc(256)/kfree -> 79 cycles 10000 times kmalloc(512)/kfree -> 85 cycles 10000 times kmalloc(1024)/kfree -> 82 cycles 10000 times kmalloc(2048)/kfree -> 82 cycles 10000 times kmalloc(4096)/kfree -> 85 cycles 10000 times kmalloc(8192)/kfree -> 82 cycles 10000 times kmalloc(16384)/kfree -> 706 cycles Kmalloc: Repeatedly allocate then free test Before: 10000 times kmalloc(8) -> 211 cycles kfree -> 113 cycles 10000 times kmalloc(16) -> 174 cycles kfree -> 115 cycles 10000 times kmalloc(32) -> 235 cycles kfree -> 129 cycles 10000 times kmalloc(64) -> 222 cycles kfree -> 120 cycles 10000 times kmalloc(128) -> 343 cycles kfree -> 139 cycles 10000 times kmalloc(256) -> 827 cycles kfree -> 147 cycles 10000 times kmalloc(512) -> 1048 cycles kfree -> 272 cycles 10000 times kmalloc(1024) -> 2043 cycles kfree -> 528 cycles 10000 times kmalloc(2048) -> 4002 cycles kfree -> 571 cycles 10000 times kmalloc(4096) -> 7740 cycles kfree -> 628 cycles 10000 times kmalloc(8192) -> 8062 cycles kfree -> 850 cycles 10000 times kmalloc(16384) -> 8895 cycles kfree -> 1249 cycles After: 10000 times kmalloc(8) -> 190 cycles kfree -> 129 cycles 10000 times kmalloc(16) -> 76 cycles kfree -> 123 cycles 10000 times kmalloc(32) -> 126 cycles kfree -> 124 cycles 10000 times kmalloc(64) -> 181 cycles kfree -> 128 cycles 10000 times kmalloc(128) -> 310 cycles kfree -> 140 cycles 10000 times kmalloc(256) -> 809 cycles kfree -> 165 cycles 10000 times kmalloc(512) -> 1005 cycles kfree -> 269 cycles 10000 times kmalloc(1024) -> 1999 cycles kfree -> 527 cycles 10000 times kmalloc(2048) -> 3967 cycles kfree -> 570 cycles 10000 times kmalloc(4096) -> 7658 cycles kfree -> 637 cycles 10000 times kmalloc(8192) -> 8111 cycles kfree -> 859 cycles 10000 times kmalloc(16384) -> 8791 cycles kfree -> 1173 cycles Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
| * slub: min_partial needs to be in first cachelineChristoph Lameter2011-03-111-1/+1
| | | | | | | | | | | | | | | | It is used in unfreeze_slab() which is a performance critical function. Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
* | slub: automatically reserve bytes at the end of slabLai Jiangshan2011-03-111-0/+1
|/ | | | | | | | | | | | | | | | | There is no "struct" for slub's slab, it shares with struct page. But struct page is very small, it is insufficient when we need to add some metadata for slab. So we add a field "reserved" to struct kmem_cache, when a slab is allocated, kmem_cache->reserved bytes are automatically reserved at the end of the slab for slab's metadata. Changed from v1: Export the reserved field via sysfs Acked-by: Christoph Lameter <cl@linux.com> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
* slub tracing: move trace calls out of always inlined functions to reduce ↵Richard Kennedy2010-11-061-29/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kernel code size Having the trace calls defined in the always inlined kmalloc functions in include/linux/slub_def.h causes a lot of code duplication as the trace functions get instantiated for each kamalloc call site. This can simply be removed by pushing the trace calls down into the functions in slub.c. On my x86_64 built this patch shrinks the code size of the kernel by approx 36K and also shrinks the code size of many modules -- too many to list here ;) size vmlinux (2.6.36) reports text data bss dec hex filename 5410611 743172 828928 6982711 6a8c37 vmlinux 5373738 744244 828928 6946910 6a005e vmlinux + patch The resulting kernel has had some testing & kmalloc trace still seems to work. This patch - moves trace_kmalloc out of the inlined kmalloc() and pushes it down into kmem_cache_alloc_trace() so this it only get instantiated once. - rename kmem_cache_alloc_notrace() to kmem_cache_alloc_trace() to indicate that now is does have tracing. (maybe this would better being called something like kmalloc_kmem_cache ?) - adds a new function kmalloc_order() to handle allocation and tracing of large allocations of page order. - removes tracing from the inlined kmalloc_large() replacing them with a call to kmalloc_order(); - move tracing out of inlined kmalloc_node() and pushing it down into kmem_cache_alloc_node_trace - rename kmem_cache_alloc_node_notrace() to kmem_cache_alloc_node_trace() - removes the include of trace/events/kmem.h from slub_def.h. v2 - keep kmalloc_order_trace inline when !CONFIG_TRACE Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk> Signed-off-by: Pekka Enberg <penberg@kernel.org>
* slub: Enable sysfs support for !CONFIG_SLUB_DEBUGChristoph Lameter2010-10-061-1/+1
| | | | | | | | | | Currently disabling CONFIG_SLUB_DEBUG also disabled SYSFS support meaning that the slabs cannot be tuned without DEBUG. Make SYSFS support independent of CONFIG_SLUB_DEBUG Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
* slub: reduce differences between SMP and NUMAChristoph Lameter2010-10-021-4/+1
| | | | | | | | | | Reduce the #ifdefs and simplify bootstrap by making SMP and NUMA as much alike as possible. This means that there will be an additional indirection to get to the kmem_cache_node field under SMP. Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
* slub: Dynamically size kmalloc cache allocationsChristoph Lameter2010-10-021-5/+2
| | | | | | | | | | | | | | | | | | | kmalloc caches are statically defined and may take up a lot of space just because the sizes of the node array has to be dimensioned for the largest node count supported. This patch makes the size of the kmem_cache structure dynamic throughout by creating a kmem_cache slab cache for the kmem_cache objects. The bootstrap occurs by allocating the initial one or two kmem_cache objects from the page allocator. C2->C3 - Fix various issues indicated by David - Make create kmalloc_cache return a kmem_cache * pointer. Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Pekka Enberg <penberg@kernel.org>
* Merge branch 'for-linus' of ↵Linus Torvalds2010-08-221-1/+1
|\ | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6: slab: fix object alignment slub: add missing __percpu markup in mm/slub_def.h
| * slub: add missing __percpu markup in mm/slub_def.hNamhyung Kim2010-08-091-1/+1
| | | | | | | | | | | | | | | | | | kmem_cache->cpu_slab is a percpu pointer but was missing __percpu markup. Add it. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Pekka Enberg <penberg@kernel.org>
* | dma-mapping: rename ARCH_KMALLOC_MINALIGN to ARCH_DMA_MINALIGNFUJITA Tomonori2010-08-111-3/+5
|/ | | | | | | | | | | | | | | | | | | | | | | | | Now each architecture has the own dma_get_cache_alignment implementation. dma_get_cache_alignment returns the minimum DMA alignment. Architectures define it as ARCH_KMALLOC_MINALIGN (it's used to make sure that malloc'ed buffer is DMA-safe; the buffer doesn't share a cache with the others). So we can unify dma_get_cache_alignment implementations. This patch: dma_get_cache_alignment() needs to know if an architecture defines ARCH_KMALLOC_MINALIGN or not (needs to know if architecture has DMA alignment restriction). However, slab.h define ARCH_KMALLOC_MINALIGN if architectures doesn't define it. Let's rename ARCH_KMALLOC_MINALIGN to ARCH_DMA_MINALIGN. ARCH_KMALLOC_MINALIGN is used only in the internals of slab/slob/slub (except for crypto). Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'perf/core' of ↵Ingo Molnar2010-06-091-1/+2
|\ | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/core
| * tracing: Remove kmemtrace ftrace pluginLi Zefan2010-06-091-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | We have been resisting new ftrace plugins and removing existing ones, and kmemtrace has been superseded by kmem trace events and perf-kmem, so we remove it. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Cc: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org> [ remove kmemtrace from the makefile, handle slob too ] Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
* | SLUB: Allow full duplication of kmalloc array for 390Christoph Lameter2010-05-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Commit 756dee75872a2a764b478e18076360b8a4ec9045 ("SLUB: Get rid of dynamic DMA kmalloc cache allocation") makes S390 run out of kmalloc caches. Increase the number of kmalloc caches to a safe size. Cc: <stable@kernel.org> [ .33 and .34 ] Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com> Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
* | slub: move kmem_cache_node into it's own cachelineAlexander Duyck2010-05-241-6/+3
|/ | | | | | | | | | | | | | | | | | | | This patch is meant to improve the performance of SLUB by moving the local kmem_cache_node lock into it's own cacheline separate from kmem_cache. This is accomplished by simply removing the local_node when NUMA is enabled. On my system with 2 nodes I saw around a 5% performance increase w/ hackbench times dropping from 6.2 seconds to 5.9 seconds on average. I suspect the performance gain would increase as the number of nodes increases, but I do not have the data to currently back that up. Bugzilla-Reference: http://bugzilla.kernel.org/show_bug.cgi?id=15713 Cc: <stable@kernel.org> Reported-by: Alex Shi <alex.shi@intel.com> Tested-by: Alex Shi <alex.shi@intel.com> Acked-by: Yanmin Zhang <yanmin_zhang@linux.intel.com> Acked-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
* mm: Move ARCH_SLAB_MINALIGN and ARCH_KMALLOC_MINALIGN to <linux/slub_def.h>David Woodhouse2010-05-191-0/+8
| | | | | | Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
* SLUB: this_cpu: Remove slub kmem_cache fieldsChristoph Lameter2009-12-201-2/+0
| | | | | | | | | | | | | | | | | | | Remove the fields in struct kmem_cache_cpu that were used to cache data from struct kmem_cache when they were in different cachelines. The cacheline that holds the per cpu array pointer now also holds these values. We can cut down the struct kmem_cache_cpu size to almost half. The get_freepointer() and set_freepointer() functions that used to be only intended for the slow path now are also useful for the hot path since access to the size field does not require accessing an additional cacheline anymore. This results in consistent use of functions for setting the freepointer of objects throughout SLUB. Also we initialize all possible kmem_cache_cpu structures when a slab is created. No need to initialize them when a processor or node comes online. Signed-off-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
* SLUB: Get rid of dynamic DMA kmalloc cache allocationChristoph Lameter2009-12-201-8/+11
| | | | | | | | | Dynamic DMA kmalloc cache allocation is troublesome since the new percpu allocator does not support allocations in atomic contexts. Reserve some statically allocated kmalloc_cpu structures instead. Signed-off-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
* SLUB: Use this_cpu operations in slubChristoph Lameter2009-12-201-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Using per cpu allocations removes the needs for the per cpu arrays in the kmem_cache struct. These could get quite big if we have to support systems with thousands of cpus. The use of this_cpu_xx operations results in: 1. The size of kmem_cache for SMP configuration shrinks since we will only need 1 pointer instead of NR_CPUS. The same pointer can be used by all processors. Reduces cache footprint of the allocator. 2. We can dynamically size kmem_cache according to the actual nodes in the system meaning less memory overhead for configurations that may potentially support up to 1k NUMA nodes / 4k cpus. 3. We can remove the diddle widdle with allocating and releasing of kmem_cache_cpu structures when bringing up and shutting down cpus. The cpu alloc logic will do it all for us. Removes some portions of the cpu hotplug functionality. 4. Fastpath performance increases since per cpu pointer lookups and address calculations are avoided. V7-V8 - Convert missed get_cpu_slab() under CONFIG_SLUB_STATS Signed-off-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
* tracing, slab: Define kmem_cache_alloc_notrace ifdef CONFIG_TRACINGLi Zefan2009-12-111-2/+2
| | | | | | | | | | | | | | | | | Define kmem_trace_alloc_{,node}_notrace() if CONFIG_TRACING is enabled, otherwise perf-kmem will show wrong stats ifndef CONFIG_KMEM_TRACE, because a kmalloc() memory allocation may be traced by both trace_kmalloc() and trace_kmem_cache_alloc(). Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: linux-mm@kvack.org <linux-mm@kvack.org> Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> LKML-Reference: <4B21F89A.7000801@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
*-. Merge branches 'slab/cleanups' and 'slab/fixes' into for-linusPekka Enberg2009-09-141-6/+2
|\ \
| | * SLUB: fix ARCH_KMALLOC_MINALIGN cases 64 and 256Aaro Koskinen2009-08-301-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the minalign is 64 bytes, then the 96 byte cache should not be created because it would conflict with the 128 byte cache. If the minalign is 256 bytes, patching the size_index table should not result in a buffer overrun. The calculation "(i - 1) / 8" used to access size_index[] is moved to a separate function as suggested by Christoph Lameter. Acked-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Aaro Koskinen <aaro.koskinen@nokia.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
| * | slab: remove duplicate kmem_cache_init_late() declarationsWu Fengguang2009-08-061-2/+0
|/ / | | | | | | | | | | | | | | | | | | kmem_cache_init_late() has been declared in slab.h CC: Nick Piggin <npiggin@suse.de> CC: Matt Mackall <mpm@selenic.com> CC: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
* / kmemleak: Trace the kmalloc_large* functions in slubCatalin Marinas2009-07-081-0/+2
|/ | | | | | | | | | The kmalloc_large() and kmalloc_large_node() functions were missed when adding the kmemleak hooks to the slub allocator. However, they should be traced to avoid false positives. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Lameter <cl@linux-foundation.org> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
* slab,slub: don't enable interrupts during early bootPekka Enberg2009-06-121-0/+2
| | | | | | | | | | | | | | | | | | | As explained by Benjamin Herrenschmidt: Oh and btw, your patch alone doesn't fix powerpc, because it's missing a whole bunch of GFP_KERNEL's in the arch code... You would have to grep the entire kernel for things that check slab_is_available() and even then you'll be missing some. For example, slab_is_available() didn't always exist, and so in the early days on powerpc, we used a mem_init_done global that is set form mem_init() (not perfect but works in practice). And we still have code using that to do the test. Therefore, mask out __GFP_WAIT, __GFP_IO, and __GFP_FS in the slab allocators in early boot code to avoid enabling interrupts. Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
* tracing, kmemtrace: Separate include/trace/kmemtrace.h to kmemtrace part and ↵Zhaolei2009-04-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | tracepoint part Impact: refactor code for future changes Current kmemtrace.h is used both as header file of kmemtrace and kmem's tracepoints definition. Tracepoints' definition file may be used by other code, and should only have definition of tracepoint. We can separate include/trace/kmemtrace.h into 2 files: include/linux/kmemtrace.h: header file for kmemtrace include/trace/kmem.h: definition of kmem tracepoints Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Acked-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <49DEE68A.5040902@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* kmemtrace: use tracepointsEduard - Gabriel Munteanu2009-04-031-8/+4
| | | | | | | | | | | | | kmemtrace now uses tracepoints instead of markers. We no longer need to use format specifiers to pass arguments. Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> [ folded: Use the new TP_PROTO and TP_ARGS to fix the build. ] [ folded: fix build when CONFIG_KMEMTRACE is disabled. ] [ folded: define tracepoints when CONFIG_TRACEPOINTS is enabled. ] Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> LKML-Reference: <ae61c0f37156db8ec8dc0d5778018edde60a92e3.1237813499.git.eduard.munteanu@linux360.ro> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* Merge branch 'tracing/core-v2' into tracing-for-linusIngo Molnar2009-04-021-3/+50
|\ | | | | | | | | | | | | | | Conflicts: include/linux/slub_def.h lib/Kconfig.debug mm/slob.c mm/slub.c
| * Merge branch 'for-ingo' of ↵Ingo Molnar2009-02-201-3/+16
| |\ | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6 into tracing/kmemtrace Conflicts: mm/slub.c
| | * SLUB: Introduce and use SLUB_MAX_SIZE and SLUB_PAGE_SHIFT constantsChristoph Lameter2009-02-201-3/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As a preparational patch to bump up page allocator pass-through threshold, introduce two new constants SLUB_MAX_SIZE and SLUB_PAGE_SHIFT and convert mm/slub.c to use them. Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com> Tested-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com> Signed-off-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
| * | tracing/kmemtrace: normalize the raw tracer event to the unified tracing APIFrederic Weisbecker2008-12-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: new tracer plugin This patch adapts kmemtrace raw events tracing to the unified tracing API. To enable and use this tracer, just do the following: echo kmemtrace > /debugfs/tracing/current_tracer cat /debugfs/tracing/trace You will have the following output: # tracer: kmemtrace # # # ALLOC TYPE REQ GIVEN FLAGS POINTER NODE CALLER # FREE | | | | | | | | # | type_id 1 call_site 18446744071565527833 ptr 18446612134395152256 type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1 type_id 1 call_site 18446744071565585534 ptr 18446612134405955584 type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1 type_id 0 call_site 18446744071565636711 ptr 18446612134345164672 bytes_req 240 bytes_alloc 240 gfp_flags 208 node -1 type_id 1 call_site 18446744071565585534 ptr 18446612134405955584 type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1 type_id 0 call_site 18446744071565636711 ptr 18446612134345164912 bytes_req 240 bytes_alloc 240 gfp_flags 208 node -1 type_id 1 call_site 18446744071565585534 ptr 18446612134405955584 type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1 type_id 0 call_site 18446744071565636711 ptr 18446612134345165152 bytes_req 240 bytes_alloc 240 gfp_flags 208 node -1 type_id 0 call_site 18446744071566144042 ptr 18446612134346191680 bytes_req 1304 bytes_alloc 1312 gfp_flags 208 node -1 type_id 1 call_site 18446744071565585534 ptr 18446612134405955584 type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1 type_id 1 call_site 18446744071565585534 ptr 18446612134405955584 That was to stay backward compatible with the format output produced in inux/tracepoint.h. This is the default ouput, but note that I tried something else. If you change an option: echo kmem_minimalistic > /debugfs/trace_options and then cat /debugfs/trace, you will have the following output: # tracer: kmemtrace # # # ALLOC TYPE REQ GIVEN FLAGS POINTER NODE CALLER # FREE | | | | | | | | # | - C 0xffff88007c088780 file_free_rcu + K 4096 4096 000000d0 0xffff88007cad6000 -1 getname - C 0xffff88007cad6000 putname + K 4096 4096 000000d0 0xffff88007cad6000 -1 getname + K 240 240 000000d0 0xffff8800790dc780 -1 d_alloc - C 0xffff88007cad6000 putname + K 4096 4096 000000d0 0xffff88007cad6000 -1 getname + K 240 240 000000d0 0xffff8800790dc870 -1 d_alloc - C 0xffff88007cad6000 putname + K 4096 4096 000000d0 0xffff88007cad6000 -1 getname + K 240 240 000000d0 0xffff8800790dc960 -1 d_alloc + K 1304 1312 000000d0 0xffff8800791d7340 -1 reiserfs_alloc_inode - C 0xffff88007cad6000 putname + K 4096 4096 000000d0 0xffff88007cad6000 -1 getname - C 0xffff88007cad6000 putname + K 992 1000 000000d0 0xffff880079045b58 -1 alloc_inode + K 768 1024 000080d0 0xffff88007c096400 -1 alloc_pipe_info + K 240 240 000000d0 0xffff8800790dca50 -1 d_alloc + K 272 320 000080d0 0xffff88007c088780 -1 get_empty_filp + K 272 320 000080d0 0xffff88007c088000 -1 get_empty_filp Yeah I shall confess kmem_minimalistic should be: kmem_alternative. Whatever, I find it more readable but this a personal opinion of course. We can drop it if you want. On the ALLOC/FREE column, + means an allocation and - a free. On the type column, you have K = kmalloc, C = cache, P = page I would like the flags to be GFP_* strings but that would not be easy to not break the column with strings.... About the node...it seems to always be -1. I don't know why but that shouldn't be difficult to find. I moved linux/tracepoint.h to trace/tracepoint.h as well. I think that would be more easy to find the tracer headers if they are all in their common directory. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | kmemtrace: SLUB hooks.Eduard - Gabriel Munteanu2008-12-291-3/+50
| |/ | | | | | | | | | | | | This adds hooks for the SLUB allocator, to allow tracing with kmemtrace. Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
| |
| \
*-. \ Merge branches 'topic/slob/cleanups', 'topic/slob/fixes', 'topic/slub/core', ↵Pekka Enberg2009-03-241-4/+17
|\ \ \ | |_|/ |/| | | | | 'topic/slub/cleanups' and 'topic/slub/perf' into for-linus
| | * SLUB: Do not pass 8k objects through to the page allocatorPekka Enberg2009-02-201-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Increase the maximum object size in SLUB so that 8k objects are not passed through to the page allocator anymore. The network stack uses 8k objects for performance critical operations. The patch is motivated by a SLAB vs. SLUB regression in the netperf benchmark. The problem is that the kfree(skb->head) call in skb_release_data() that is subject to page allocator pass-through as the size passed to __alloc_skb() is larger than 4 KB in this test. As explained by Yanmin Zhang: I use 2.6.29-rc2 kernel to run netperf UDP-U-4k CPU_NUM client/server pair loopback testing on x86-64 machines. Comparing with SLUB, SLAB's result is about 2.3 times of SLUB's. After applying the reverting patch, the result difference between SLUB and SLAB becomes 1% which we might consider as fluctuation. [ penberg@cs.helsinki.fi: fix oops in kmalloc() ] Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com> Tested-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com> Signed-off-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
| | * SLUB: Introduce and use SLUB_MAX_SIZE and SLUB_PAGE_SHIFT constantsChristoph Lameter2009-02-201-3/+16
| |/ |/| | | | | | | | | | | | | | | | | | | As a preparational patch to bump up page allocator pass-through threshold, introduce two new constants SLUB_MAX_SIZE and SLUB_PAGE_SHIFT and convert mm/slub.c to use them. Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com> Tested-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com> Signed-off-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
| * slub: move min_partial to struct kmem_cacheDavid Rientjes2009-02-231-1/+1
|/ | | | | | | | | Although it allows for better cacheline use, it is unnecessary to save a copy of the cache's min_partial value in each kmem_cache_node. Cc: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>