Skip to content

Commit 233756a

Browse files
ftang1paulmckrcu
authored andcommitted
x86/tsc: Extend watchdog check exemption to 4-Sockets platform
There were reports again that the tsc clocksource on 4 sockets x86 servers was wrongly judged as 'unstable' by 'jiffies' and other watchdogs, and disabled [1][2]. Commit b50db70 ("x86/tsc: Disable clocksource watchdog for TSC on qualified platorms") was introduce to deal with these false alarms of tsc unstable issues, covering qualified platforms for 2 sockets or smaller ones. And from history of chasing TSC issues, Thomas and Peter only saw real TSC synchronization issue on 8 socket machines. So extend the exemption to 4 sockets to fix the issue. Rui also proposed another way to disable 'jiffies' as clocksource watchdog [3], which can also solve problem in [1]. in an architecture independent way, but can't cure the problem in [2]. whose watchdog is HPET or PMTIMER, while 'jiffies' is mostly used as watchdog in boot phase. 'nr_online_nodes' has known inaccurate problem for cases like platform with cpu-less memory nodes, sub numa cluster enabled, fakenuma, kernel cmdline parameter 'maxcpus=', etc. The harmful case is the 'maxcpus' one which could possibly under estimates the package number, and disable the watchdog, but bright side is it is mostly for debug usage. All these will be addressed in other patches, as discussed in thread [4]. [1]. https://lore.kernel.org/all/9d3bf570-3108-0336-9c52-9bee15767d29@huawei.com/ [2]. https://lore.kernel.org/lkml/06df410c-2177-4671-832f-339cff05b1d9@paulmck-laptop/ [3]. https://lore.kernel.org/all/bd5b97f89ab2887543fc262348d1c7cafcaae536.camel@intel.com/ [4]. https://lore.kernel.org/all/20221021062131.1826810-1-feng.tang@intel.com/ Reported-by: Yu Liao <liaoyu15@huawei.com> Reported-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
1 parent e40806e commit 233756a

File tree

1 file changed

+1
-1
lines changed

1 file changed

+1
-1
lines changed

arch/x86/kernel/tsc.c

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1258,7 +1258,7 @@ static void __init check_system_tsc_reliable(void)
12581258
if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
12591259
boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
12601260
boot_cpu_has(X86_FEATURE_TSC_ADJUST) &&
1261-
nr_online_nodes <= 2)
1261+
nr_online_nodes <= 4)
12621262
tsc_disable_clocksource_watchdog();
12631263
}
12641264

0 commit comments

Comments
 (0)