Created attachment 1191 [details] crash screenshot

On a server with an intel_th device, running 1000 reboot cycles occasionally triggers a crash during boot.

Symptoms:
1. Every crash is accompanied by an error while loading the intel_th_pci driver; see the log below.
2. The crash location differs each time, but it is always memory-related (see attachment).
3. With the intel_th_pci driver disabled, the crash no longer occurs.

[ 18.546377] intel_th_pci 0000:00:1f.7: enabling device (0140 -> 0142)
[ 18.557952] sysfs: cannot create duplicate filename '/dev/char/236:2'
[ 18.564544] CPU: 2 PID: 1034 Comm: modprobe Tainted: G OE 4.19.91-26.6.7.kos5.x86_64 #1
[ 18.568290] ioatdma 0000:00:01.4: enabling device (0144 -> 0146)
[ 18.573897] Hardware name: IEITSYSTEMS Qingqiu/Qingqiu, BIOS 1.02.08 03/21/2024
[ 18.573898] Call Trace:
[ 18.573914] dump_stack+0x66/0x90
[ 18.573924] sysfs_warn_dup.cold.0+0x17/0x23
[ 18.597930] sysfs_do_create_link_sd.isra.0+0xaa/0xd0
[ 18.600161] ioatdma 0000:00:01.5: enabling device (0144 -> 0146)
[ 18.603121] device_add+0x5d5/0x690
[ 18.603131] intel_th_subdevice_alloc+0x385/0x450 [intel_th]
[ 18.618679] ? add_dr+0x35/0x60
[ 18.621955] ? add_dr+0x35/0x60
[ 18.625233] intel_th_output_enable+0xaf/0xe0 [intel_th]
[ 18.630680] intel_th_gth_probe+0x216/0x4a0 [intel_th_gth]
[ 18.636302] ? kernfs_remove_by_name_ns+0x5b/0x90
[ 18.641142] intel_th_probe+0x69/0x120 [intel_th]
[ 18.645983] really_probe+0x23e/0x390
[ 18.649780] driver_probe_device+0xd1/0x110
[ 18.654096] __driver_attach+0xea/0x110
[ 18.658063] ? driver_probe_device+0x110/0x110
[ 18.662638] ? driver_probe_device+0x110/0x110
[ 18.667216] bus_for_each_dev+0x63/0x90
[ 18.671185] bus_add_driver+0x152/0x230
[ 18.675158] ? 0xffffffffc024d000
[ 18.678607] driver_register+0x6b/0xb0
[ 18.682491] ? 0xffffffffc024d000
[ 18.685944] do_one_initcall+0x46/0x1c3
[ 18.689914] ? free_unref_page_commit+0x9b/0x110
[ 18.694669] ? kmem_cache_alloc_trace+0x33/0x190
[ 18.699421] ? do_init_module+0x22/0x210
[ 18.703477] do_init_module+0x5a/0x210
[ 18.707362] load_module+0x1519/0x15d0
[ 18.711250] ? __se_sys_finit_module+0x82/0xc0
[ 18.715829] __se_sys_finit_module+0x82/0xc0
[ 18.720235] do_syscall_64+0x5f/0x1b0
[ 18.723551] ioatdma 0000:00:01.6: enabling device (0144 -> 0146)
[ 18.724034] ? prepare_exit_to_usermode+0x4c/0xb0
[ 18.724042] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 18.740199] RIP: 0033:0x7f6c15a3991d
[ 18.743323] ioatdma 0000:00:01.7: enabling device (0144 -> 0146)
[ 18.743913] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 3b 55 38 00 f7 d8 64 89 01 48
[ 18.743916] RSP: 002b:00007ffeec354968 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 18.776652] RAX: ffffffffffffffda RBX: 000055e1ce597c80 RCX: 00007f6c15a3991d
[ 18.783916] RDX: 0000000000000000 RSI: 000055e1cd2338b6 RDI: 0000000000000000
[ 18.791184] RBP: 000055e1cd2338b6 R08: 0000000000000000 R09: 0000000000000000
[ 18.798448] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[ 18.805713] R13: 000055e1ce597f70 R14: 0000000000040000 R15: 0000000000000000
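
Since the crash is intermittent across 1000 reboots, it may help to flag automatically which boots hit the duplicate sysfs entry warning shown in the log above. The following is only a minimal sketch, assuming the kernel log of each boot has been saved as plain text in the same format as this report; the function name `find_sysfs_dups` and the regular expression are illustrative and not part of any existing tool.

```python
import re

# Matches lines such as:
# [ 18.557952] sysfs: cannot create duplicate filename '/dev/char/236:2'
# The pattern is an assumption based on the log format in this report.
DUP_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+sysfs: cannot create duplicate filename "
    r"'(?P<path>[^']+)'"
)


def find_sysfs_dups(log_text):
    """Return (timestamp, path) tuples for each duplicate-filename warning."""
    return [
        (float(m.group("ts")), m.group("path"))
        for m in DUP_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    import sys

    # Read a saved kernel log from stdin and print one line per warning.
    for ts, path in find_sysfs_dups(sys.stdin.read()):
        print(f"{ts:.6f} {path}")
```

One possible use is to pipe the output of `dmesg` from each boot into the script and record which reboot iterations print a path, so that the warning can be correlated with the later memory-related crashes.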
Created attachment 1192 [details] crash screenshot