Description of problem:
With watchdog_thresh set to 60, a soft lockup takes more than 200 seconds to trigger.

Version-Release number of selected component (if applicable):
Linux localhost.localdomain 5.10.134-18.an8.x86_64 #1 SMP Fri Dec 13 16:32:58 CST 2024 x86_64 x86_64 x86_64 GNU/Linux

How reproducible:

Steps to Reproduce:
1. echo 1 > /proc/sys/kernel/watchdog
2. echo 60 > /proc/sys/kernel/watchdog_thresh
3. echo 1 > /proc/sys/kernel/softlockup_panic
4. Inject a kernel-mode infinite-loop fault.

Actual results:
The soft lockup triggers after more than 200 seconds.

Expected results:
The soft lockup reset triggers after about 120 seconds.

Additional info:
Fault injection code:

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/kthread.h>
#include <linux/sched.h>
#include <linux/delay.h>
#include <linux/cpumask.h>
#include <linux/timekeeping.h>

static struct task_struct *my_thread;

/* Module parameters; declared but not used by this reproducer. */
static int cpu_id;
static int delay_time;
module_param(cpu_id, int, 0644);
module_param(delay_time, int, 0644);

static int my_thread_fn(void *data)
{
	/* Timing variables; declared but not used by this reproducer. */
	ktime_t start, end;
	s64 elapsed_ns;

	pr_info("[my_thread] running on CPU %d\n", smp_processor_id());

	/* Busy loop that never yields, so the CPU stays stuck in kernel mode. */
	while (!kthread_should_stop()) {
	}

	return 0;
}

static int __init my_module_init(void)
{
	pr_info("[my_module] init\n");

	my_thread = kthread_create(my_thread_fn, NULL, "my_kthread");
	if (IS_ERR(my_thread)) {
		pr_err("[my_module] Failed to create kthread\n");
		return PTR_ERR(my_thread);
	}

	wake_up_process(my_thread);
	return 0;
}

static void __exit my_module_exit(void)
{
	pr_info("[my_module] exit\n");
	if (my_thread)
		kthread_stop(my_thread);
}

module_init(my_module_init);
module_exit(my_module_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("ChatGPT");
MODULE_DESCRIPTION("cpu loop update demo with timing");
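For reference, the "about 120 seconds" expectation can be derived from watchdog_thresh. The sketch below is a userspace illustration, assuming this kernel keeps the mainline 5.10 relations in kernel/watchdog.c (soft lockup threshold = 2 * watchdog_thresh, timer sample period = threshold / 5). The function names are hypothetical helpers, not kernel symbols, and the Anolis kernel may carry patches that change these relations.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Assumed to mirror get_softlockup_thresh(): threshold is 2 * watchdog_thresh. */
static inline int softlockup_thresh_sec(int watchdog_thresh)
{
    assert(watchdog_thresh >= 0);
    return watchdog_thresh * 2;
}

/* Assumed to mirror set_sample_period(): the timer fires 5 times per threshold. */
static inline uint64_t sample_period_ns(int watchdog_thresh)
{
    return (uint64_t)softlockup_thresh_sec(watchdog_thresh) * (NSEC_PER_SEC / 5);
}

/*
 * Hypothetical helper: the lockup check only runs when the sample timer fires,
 * so a nominal report can lag the threshold by up to one sample period.
 */
static inline uint64_t latest_nominal_report_ns(int watchdog_thresh)
{
    return (uint64_t)softlockup_thresh_sec(watchdog_thresh) * NSEC_PER_SEC
           + sample_period_ns(watchdog_thresh);
}
```

With watchdog_thresh = 60 this gives a 120 s threshold and a 24 s sample period, so under these assumptions a report would nominally be expected between 120 s and 144 s after the watchdog timestamp was last updated.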
[ 248.899325] [my_module] init
[ 248.899371] [my_thread] running on CPU 0
[ 414.615575] watchdog: BUG: soft lockup - CPU#0 stuck for 157s! [my_kthread:3225]
[ 414.616225] CPU#0 Utilization every 24s during lockup:
[ 414.616635] #1: 100% system, 0% softirq, 1% hardirq, 0% idle
[ 414.617142] #2: 100% system, 0% softirq, 1% hardirq, 0% idle
[ 414.617638] #3: 100% system, 0% softirq, 1% hardirq, 0% idle
[ 414.618142] #4: 100% system, 0% softirq, 1% hardirq, 0% idle
[ 414.618635] #5: 100% system, 0% softirq, 1% hardirq, 0% idle

In my test it does take somewhat longer than 120s, but it is nowhere near 200s.
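The delay in the log above can be read directly off the dmesg timestamps. A minimal sketch of that arithmetic, with timestamps expressed in microseconds to avoid floating-point rounding (to_us and elapsed_us are hypothetical helper names used only for this illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Convert a dmesg timestamp "[ sec.usec]" into microseconds. */
static inline int64_t to_us(int64_t sec, int64_t usec)
{
    assert(usec >= 0 && usec < 1000000);
    return sec * 1000000 + usec;
}

/* Elapsed time between two dmesg timestamps, in microseconds. */
static inline int64_t elapsed_us(int64_t start_us, int64_t end_us)
{
    assert(end_us >= start_us);
    return end_us - start_us;
}
```

Calling elapsed_us(to_us(248, 899371), to_us(414, 615575)) yields 165716204, so the report appears about 165.7 s after the thread starts running, while the watchdog message itself reports 157s.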
Source site: https://mirrors.aliyun.com/anolis/8/kernel-5.10/source/Packages/?spm=a2c6h.25603864.0.0.349a715fSNXlOX
Downloaded RPM package: kernel-5.10.134-18.an8.src.rpm
Build configuration used: kernel-5.10.134-x86_64.config

After installing this kernel, apply the settings given earlier:
1. echo 1 > /proc/sys/kernel/watchdog
2. echo 60 > /proc/sys/kernel/watchdog_thresh
3. echo 1 > /proc/sys/kernel/softlockup_panic

Injecting the soft watchdog fault then shows a delay of more than 200 seconds. Please confirm the issue.