author | Huang Ying <ying.huang@intel.com> | 2009-06-15 15:37:07 +0800 |
---|---|---|
committer | H. Peter Anvin <hpa@zytor.com> | 2009-06-16 16:56:04 -0700 |
commit | 184e1fdfea066ab8f12a1e8912f402d2d6556d11 (patch) | |
tree | f9ffff79d1924f530e582a2aab9f0cb032f0a4e3 /arch | |
parent | 300df7dc89cc276377fc020704e34875d5c473b6 (diff) | |
x86, mce: fix a race condition about mce_callin and no_way_out
If one CPU has no_way_out == 1, all other CPUs should have no_way_out
== 1 as well. But although global_nwo is read after mce_callin,
global_nwo is also updated after mce_callin. So it is possible for one
CPU to read global_nwo before another CPU updates it, leaving
no_way_out == 1 on some CPUs and no_way_out == 0 on others.
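The interleaving described above can be replayed deterministically in user space. The sketch below is not kernel code: `struct shared`, `old_callin`, `old_publish`, `all_called_in` and `replay_old_ordering` are hypothetical names, and plain integers stand in for the kernel's atomic counters because the replay is single-threaded. It only illustrates how the pre-patch ordering (call in first, publish no_way_out later) can leave two CPUs with different cached values.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel's shared counters. */
struct shared {
	int callin;      /* models mce_callin */
	int global_nwo;  /* models global_nwo */
};

/* Pre-patch ordering: a CPU calls in first ... */
static void old_callin(struct shared *s)
{
	s->callin++;
}

/* ... and only later adds its no_way_out to the global state. */
static void old_publish(struct shared *s, int no_way_out)
{
	s->global_nwo += no_way_out;
}

/* A CPU may leave the wait loop once every CPU has called in. */
static int all_called_in(const struct shared *s, int cpus)
{
	return s->callin >= cpus;
}

/*
 * Replay one bad interleaving for two CPUs. CPU A has no_way_out == 1,
 * CPU B has no_way_out == 0. Returns 1 if the cached values disagree.
 */
static int replay_old_ordering(int *nwo_a, int *nwo_b)
{
	struct shared s = { 0, 0 };
	const int cpus = 2;

	old_callin(&s);                  /* CPU A calls in */
	old_callin(&s);                  /* CPU B calls in */
	old_publish(&s, 0);              /* CPU B publishes its 0 */
	assert(all_called_in(&s, cpus)); /* CPU B's wait loop can exit now */
	*nwo_b = s.global_nwo;           /* CPU B caches 0: A has not published */
	old_publish(&s, 1);              /* CPU A publishes its 1, too late for B */
	*nwo_a = s.global_nwo;           /* CPU A caches 1 */

	return *nwo_a != *nwo_b;
}
```

Calling `replay_old_ordering(&a, &b)` leaves `a == 1` and `b == 0`, which is the disagreement the commit message describes.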
This patch fixes the race condition by moving the mce_callin update
after the global_nwo update, with an smp_wmb() in between. An smp_rmb()
is also added between the read of mce_callin and the read of global_nwo.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Diffstat (limited to 'arch')
-rw-r--r-- | arch/x86/kernel/cpu/mcheck/mce.c | 12 |
1 files changed, 10 insertions, 2 deletions
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index fabba15e4558..19294b8524cb 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -703,6 +703,11 @@ static int mce_start(int no_way_out, int *order)
 	}
 
 	atomic_add(no_way_out, &global_nwo);
+	/*
+	 * global_nwo should be updated before mce_callin
+	 */
+	smp_wmb();
+	*order = atomic_add_return(1, &mce_callin);
 
 	/*
 	 * Wait for everyone.
@@ -717,6 +722,10 @@ static int mce_start(int no_way_out, int *order)
 	}
 
 	/*
+	 * mce_callin should be read before global_nwo
+	 */
+	smp_rmb();
+	/*
 	 * Cache the global no_way_out state.
 	 */
 	nwo = atomic_read(&global_nwo);
@@ -862,7 +871,7 @@ void do_machine_check(struct pt_regs *regs, long error_code)
 	 * Establish sequential order between the CPUs entering the machine
 	 * check handler.
 	 */
-	int order;
+	int order = -1;
 
 	/*
 	 * If no_way_out gets set, there is no safe way to recover from this
@@ -887,7 +896,6 @@ void do_machine_check(struct pt_regs *regs, long error_code)
 	if (!banks)
 		goto out;
 
-	order = atomic_add_return(1, &mce_callin);
 	mce_setup(&m);
 
 	m.mcgstatus = mce_rdmsrl(MSR_IA32_MCG_STATUS);
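The ordering introduced by the patch (publish global_nwo, write barrier, call in; then wait, read barrier, read global_nwo) can be sketched in user space with C11 atomics and POSIX threads. This is an illustration under assumptions, not the kernel implementation: `atomic_thread_fence(memory_order_release)` and `atomic_thread_fence(memory_order_acquire)` are used here as stand-ins for `smp_wmb()` and `smp_rmb()` (they are not exact equivalents), and `NCPUS`, `cpu_thread`, `run_rendezvous` and `cached_nwo` are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdlib.h>

#define NCPUS 4                /* hypothetical CPU count for the sketch */

static atomic_int callin;      /* models mce_callin */
static atomic_int global_nwo;  /* models global_nwo */
static int cached_nwo[NCPUS];  /* value each simulated CPU ends up caching */

struct cpu_arg {
	int id;
	int no_way_out;
};

static void *cpu_thread(void *p)
{
	struct cpu_arg *arg = p;

	/* Publish this CPU's no_way_out first ... */
	atomic_fetch_add_explicit(&global_nwo, arg->no_way_out,
				  memory_order_relaxed);
	/* ... and order it before the call-in (stand-in for smp_wmb()). */
	atomic_thread_fence(memory_order_release);
	atomic_fetch_add_explicit(&callin, 1, memory_order_relaxed);

	/* Wait for everyone. */
	while (atomic_load_explicit(&callin, memory_order_relaxed) < NCPUS)
		sched_yield();

	/* Order the call-in read before the global_nwo read (smp_rmb()). */
	atomic_thread_fence(memory_order_acquire);
	cached_nwo[arg->id] =
		atomic_load_explicit(&global_nwo, memory_order_relaxed);
	return NULL;
}

/*
 * Run one rendezvous in which only simulated CPU nwo_cpu has
 * no_way_out == 1 (pass -1 for none). Returns 1 if all CPUs cached
 * the same value.
 */
static int run_rendezvous(int nwo_cpu)
{
	pthread_t t[NCPUS];
	struct cpu_arg args[NCPUS];
	int i, same = 1;

	atomic_store(&callin, 0);
	atomic_store(&global_nwo, 0);

	for (i = 0; i < NCPUS; i++) {
		args[i].id = i;
		args[i].no_way_out = (i == nwo_cpu);
		if (pthread_create(&t[i], NULL, cpu_thread, &args[i]) != 0)
			abort();
	}
	for (i = 0; i < NCPUS; i++)
		pthread_join(t[i], NULL);

	for (i = 1; i < NCPUS; i++)
		same &= (cached_nwo[i] == cached_nwo[0]);
	return same;
}
```

Because each thread publishes before it calls in, a thread that observes `callin == NCPUS` after the acquire fence also observes every published no_way_out, so `run_rendezvous(2)` is expected to return 1 with every entry of `cached_nwo` equal to 1. Build with `-pthread` where the toolchain requires it.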