path: root/kernel/sched_cpupri.c
Commit message | Author | Age | Files | Lines
* sched: Add new prio to cpupri before removing old prio | Steven Rostedt | 2009-08-02 | 1 | -14/+16

  We need to add the new prio to the cpupri accounting before removing the old prio. This is because removing the old prio first will open a race window where the cpu will be removed from pri_active. In this case the cpu will not be visible for RT push and pulls. This could cause an RT task to not migrate appropriately, and create a very large latency.

  This bug was found with the use of ftrace sched events and trace_printk.

  Signed-off-by: Steven Rostedt <srostedt@redhat.com>
  Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
  LKML-Reference: <20090729042526.438281019@goodmis.org>
  Signed-off-by: Ingo Molnar <mingo@elte.hu>
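The ordering described in this commit can be illustrated with a small user-space sketch. This is not the kernel's code: the `ord_` names, the sizes, and the `midpoint` hook are hypothetical, and the sketch leaves out `pri_active`, counters, and locking. It only shows that when the new level is added before the old one is removed, an observer looking in between the two steps still finds the CPU in at least one level.

```c
#include <assert.h>
#include <stdint.h>

#define ORD_NR_PRI  4   /* hypothetical number of priority levels */
#define ORD_NR_CPUS 8   /* hypothetical number of CPUs */

struct ord_cpupri {
    uint8_t pri_to_cpu[ORD_NR_PRI];  /* bit c set => CPU c is at that level */
    int cpu_to_pri[ORD_NR_CPUS];     /* -1 => CPU not accounted yet */
    /* Hypothetical hook standing in for a concurrent reader. */
    void (*midpoint)(const struct ord_cpupri *cp, int cpu);
};

int ord_seen_levels = -1;  /* what the stand-in reader observed */

void ord_init(struct ord_cpupri *cp)
{
    for (int p = 0; p < ORD_NR_PRI; p++)
        cp->pri_to_cpu[p] = 0;
    for (int c = 0; c < ORD_NR_CPUS; c++)
        cp->cpu_to_pri[c] = -1;
    cp->midpoint = 0;
}

/* Count the levels in which this CPU is currently visible. */
int ord_levels_holding(const struct ord_cpupri *cp, int cpu)
{
    int n = 0;
    for (int p = 0; p < ORD_NR_PRI; p++)
        if (cp->pri_to_cpu[p] & (1u << cpu))
            n++;
    return n;
}

void ord_record(const struct ord_cpupri *cp, int cpu)
{
    ord_seen_levels = ord_levels_holding(cp, cpu);
}

void ord_set(struct ord_cpupri *cp, int cpu, int newpri)
{
    assert(cpu >= 0 && cpu < ORD_NR_CPUS);
    assert(newpri >= 0 && newpri < ORD_NR_PRI);

    int oldpri = cp->cpu_to_pri[cpu];
    if (newpri == oldpri)
        return;

    /* Step 1: make the CPU visible at its new level first. */
    cp->pri_to_cpu[newpri] |= (uint8_t)(1u << cpu);

    if (cp->midpoint)
        cp->midpoint(cp, cpu);  /* a reader running "in the window" */

    /* Step 2: only now drop it from the old level. */
    if (oldpri >= 0)
        cp->pri_to_cpu[oldpri] &= (uint8_t)~(1u << cpu);

    cp->cpu_to_pri[cpu] = newpri;
}
```

With the two steps swapped, the same hook would see the CPU in zero levels, which corresponds to the window in which it is invisible to RT push and pull.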
* sched: Fix race in cpupri introduced by cpumask_var changes | Gregory Haskins | 2009-08-02 | 1 | -1/+14

  Background:

  Several race conditions in the scheduler have cropped up recently, which Steven and I have tracked down using ftrace. The most recent one turns out to be a race in how the scheduler determines a suitable migration target for RT tasks, introduced recently with commit:

      commit 68e74568fbe5854952355e942acca51f138096d9
      Date: Tue Nov 25 02:35:13 2008 +1030

          sched: convert struct cpupri_vec cpumask_var_t.

  The original design of cpupri allowed lockless readers to quickly determine a best-estimate target. Races between the pri_active bitmap and the vec->mask were handled in the original code because we would detect and return "0" when this occurred. The design was predicated on the *effective* atomicity (*) of caching the result of cpus_and() between the cpus_allowed and the vec->mask.

  Commit 68e74568 changed the behavior such that vec->mask is accessed multiple times. This introduces a subtle race, the result of which means we can have a result that returns "1", but with an empty bitmap.

  (*) Yes, we know cpus_and() is not a locked operator across the entire composite array, but it is implicitly atomic on a per-word basis, which is all the design required to work.

  Implementation:

  Rather than forgoing the lockless design, or reverting to a stack-based cpumask_t, we simply check for when the race has been encountered and continue processing in the event that the race is hit. This renders the removal race as if the priority bit had been atomically cleared as well, and allows the algorithm to execute correctly.

  Signed-off-by: Gregory Haskins <ghaskins@novell.com>
  CC: Rusty Russell <rusty@rustcorp.com.au>
  CC: Steven Rostedt <srostedt@redhat.com>
  Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
  LKML-Reference: <20090730145728.25226.92769.stgit@dev.haskins.net>
  Signed-off-by: Ingo Molnar <mingo@elte.hu>
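The "detect the race and continue" approach described in this commit can be sketched in user space as follows. All `sk_` names are hypothetical, masks are reduced to a single 64-bit word, and the `between_reads` hook is an assumption added only so that a single-threaded run can imitate a writer emptying a level between the two reads; the kernel code has no such hook.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_NR_PRI 4  /* hypothetical number of priority levels */

struct sk_vec {
    uint64_t mask;   /* CPUs currently at this priority level */
};

struct sk_cpupri {
    struct sk_vec pri_to_cpu[SK_NR_PRI];
    /* Hypothetical hook: runs between the two reads of a level's mask. */
    void (*between_reads)(struct sk_cpupri *cp, int level);
};

/* Returns 1 and fills *lowest_mask if a level below task_pri holds an
 * allowed CPU; returns 0 otherwise. lowest_mask may be NULL. */
int sk_find(struct sk_cpupri *cp, int task_pri, uint64_t allowed,
            uint64_t *lowest_mask)
{
    assert(cp != NULL);

    for (int p = 0; p < task_pri && p < SK_NR_PRI; p++) {
        /* First read: does this level hold any allowed CPU? */
        if ((cp->pri_to_cpu[p].mask & allowed) == 0)
            continue;

        if (cp->between_reads)
            cp->between_reads(cp, p);

        if (lowest_mask) {
            /* Second read: build the result mask. */
            *lowest_mask = cp->pri_to_cpu[p].mask & allowed;

            /* The level may have been emptied since the first read.
             * Treat it as if it had never matched and keep scanning. */
            if (*lowest_mask == 0)
                continue;
        }
        return 1;
    }
    return 0;
}

/* Stand-in for a concurrent writer that empties level 0. */
void sk_clear_level0(struct sk_cpupri *cp, int level)
{
    if (level == 0)
        cp->pri_to_cpu[0].mask = 0;
}
```

Without the second check, the call would report success for level 0 while handing back an empty mask, which is the "returns 1 with an empty bitmap" outcome described in the Background section.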
* sched: Remove unneeded __ref tag | Li Zefan | 2009-06-17 | 1 | -1/+1

  Those two functions no longer call alloc_bootmem_cpumask_var(), so no need to tag them with __init_refok.

  Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
  Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
  LKML-Reference: <4A35DD5B.9050106@cn.fujitsu.com>
  Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: use slab in cpupri_init() | Pekka Enberg | 2009-06-11 | 1 | -3/+5

  Let's not use the bootmem allocator in cpupri_init() as slab is already up when it is run.

  Cc: Ingo Molnar <mingo@elte.hu>
  Cc: Linus Torvalds <torvalds@linux-foundation.org>
  Cc: Yinghai Lu <yinghai@kernel.org>
  Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
* cpumask: alloc zeroed cpumask for static cpumask_var_ts | Yinghai Lu | 2009-06-09 | 1 | -1/+1

  These are defined as static cpumask_var_t so if MAXSMP is not used, they are cleared already. Avoid surprises when MAXSMP is enabled.

  Signed-off-by: Yinghai Lu <yinghai.lu@kernel.org>
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* sched_rt: don't allocate cpumask in fastpath | Rusty Russell | 2009-04-01 | 1 | -2/+3

  Impact: cleanup

  As pointed out by Steven Rostedt. Since the arg in question is unused, we simply change cpupri_find() to accept NULL.

  Reported-by: Steven Rostedt <srostedt@redhat.com>
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  LKML-Reference: <200903251501.22664.rusty@rustcorp.com.au>
  Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: fix section mismatch | Li Zefan | 2009-01-06 | 1 | -1/+1

  init_rootdomain() calls alloc_bootmem_cpumask_var() at system boot, so does cpupri_init().

  Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
  Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: convert struct cpupri_vec cpumask_var_t. | Rusty Russell | 2008-11-24 | 1 | -11/+28

  Impact: stack usage reduction, (future) size reduction for large NR_CPUS.

  Dynamically allocating cpumasks (when CONFIG_CPUMASK_OFFSTACK) saves space for small nr_cpu_ids but big CONFIG_NR_CPUS.

  The fact cpupri_init is called both before and after the slab is available makes for an ugly parameter unfortunately.

  We also use cpumask_any_and to get rid of a temporary in cpupri_find.

  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: use a 2-d bitmap for searching lowest-pri CPU | Gregory Haskins | 2008-06-06 | 1 | -0/+174

  The current code uses a linear algorithm which causes scaling issues on larger SMP machines. This patch replaces that algorithm with a 2-dimensional bitmap to reduce latencies in the wake-up path.

  Signed-off-by: Gregory Haskins <ghaskins@novell.com>
  Acked-by: Steven Rostedt <srostedt@redhat.com>
  Signed-off-by: Ingo Molnar <mingo@elte.hu>
  Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
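As a rough illustration of the 2-dimensional bitmap idea, the sketch below keeps one bitmap of occupied priority levels and, per level, one bitmap of the CPUs at that level. It is a simplified user-space model, not the kernel implementation: the `tw_` names and the sizes are hypothetical, each CPU mask is a single 64-bit word, and there is no locking.

```c
#include <assert.h>
#include <stdint.h>

#define TW_NR_PRI 8  /* hypothetical number of priority levels */

struct tw_cpupri {
    uint8_t  pri_active;             /* dimension 1: bit p set => level p has CPUs */
    uint64_t pri_to_cpu[TW_NR_PRI];  /* dimension 2: bit c set => CPU c is at level p */
};

/* Record that a CPU is running at the given priority level. */
void tw_add(struct tw_cpupri *cp, int cpu, int pri)
{
    assert(cpu >= 0 && cpu < 64);
    assert(pri >= 0 && pri < TW_NR_PRI);
    cp->pri_to_cpu[pri] |= UINT64_C(1) << cpu;
    cp->pri_active |= (uint8_t)(1u << pri);
}

/* Find the lowest occupied level below task_pri that contains an allowed
 * CPU. Returns 1 and stores the matching CPUs in *lowest_mask (which may
 * be NULL); returns 0 if no level qualifies. */
int tw_find(const struct tw_cpupri *cp, int task_pri, uint64_t allowed,
            uint64_t *lowest_mask)
{
    for (int p = 0; p < TW_NR_PRI && p < task_pri; p++) {
        if (!(cp->pri_active & (1u << p)))
            continue;  /* no CPU at this level at all */

        uint64_t hit = cp->pri_to_cpu[p] & allowed;
        if (hit == 0)
            continue;  /* CPUs here, but none the task may use */

        if (lowest_mask)
            *lowest_mask = hit;
        return 1;
    }
    return 0;
}
```

In this model the search walks priority levels rather than CPUs, so its cost depends on the number of levels and not on how many CPUs the machine has.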