Bug 8040 - kubelet reading blkio.bfq.io_serviced_recursive causes Kernel panic - not syncing: Hard LOCKUP
Summary: kubelet reading blkio.bfq.io_serviced_recursive causes Kernel panic - not syncing: Har...
Status: RESOLVED FIXED
Alias: None
Product: Anolis OS 8
Classification: Anolis OS
Component: kernel - anck-4.19
Version: ---
Hardware: All Linux
Importance: P3-Medium S3-normal
Target Milestone: ---
Assignee: maqiao_mq
QA Contact: shuming
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-01-24 17:46 UTC by fang1037
Modified: 2024-02-08 20:26 UTC
CC List: 2 users

See Also:


Attachments

Description fang1037 2024-01-24 17:46:17 UTC
Description of problem:

Machines in our production environment occasionally crash because of a suspected deadlock. The vmcore information is as follows:


WARNING: kernel relocated [80MB]: patching 97028 gdb minimal_symbol values

      KERNEL: usr/lib/debug/lib/modules/4.19.91-27.1.an7.x86_64/vmlinux
    DUMPFILE: /home/caas/vmcore  [PARTIAL DUMP]
        CPUS: 112
        DATE: Fri Dec 29 12:36:23 2023
      UPTIME: 135 days, 20:07:49
LOAD AVERAGE: 51.86, 40.13, 36.83
       TASKS: 5126
    NODENAME: -----
     RELEASE: 4.19.91-27.1.an7.x86_64
     VERSION: #1 SMP Tue Feb 21 11:27:29 CST 2023
     MACHINE: x86_64  (2000 Mhz)
      MEMORY: 383.4 GB
       PANIC: "Kernel panic - not syncing: Hard LOCKUP"
         PID: 18068
     COMMAND: "kubelet"
        TASK: ffff96ebb63f8000  [THREAD_INFO: ffff96ebb63f8000]
         CPU: 107
       STATE: TASK_RUNNING (PANIC)


crash> bt
PID: 18068  TASK: ffff96ebb63f8000  CPU: 107  COMMAND: "kubelet"
 #0 [fffffe000126ca50] machine_kexec at ffffffff86064de8
 #1 [fffffe000126caa0] __crash_kexec at ffffffff8614b62a
 #2 [fffffe000126cb60] panic at ffffffff860a2423
 #3 [fffffe000126cbd8] nmi_panic at ffffffff860a1cb7
 #4 [fffffe000126cbe0] watchdog_hardlockup_check at ffffffff8617f80d
 #5 [fffffe000126cbf0] __perf_event_overflow at ffffffff861eab81
 #6 [fffffe000126cc20] handle_pmi_common at ffffffff8600e129
 #7 [fffffe000126ce08] intel_pmu_handle_irq at ffffffff8600e27d
 #8 [fffffe000126ce50] perf_event_nmi_handler at ffffffff860061ce
 #9 [fffffe000126ce68] nmi_handle at ffffffff8602b8fe
#10 [fffffe000126ceb8] default_do_nmi at ffffffff8602bdae
#11 [fffffe000126ced0] do_nmi at ffffffff8602bf87
#12 [fffffe000126cef0] end_repeat_nmi at ffffffff86a0162d
    [exception RIP: queued_spin_lock_slowpath+268]
    RIP: ffffffff86100d0c  RSP: ffffa91e0d1abe18  RFLAGS: 00000046
    RAX: 0000000000000000  RBX: ffffffff87390b20  RCX: 0000000001b00000
    RDX: ffff96ecafae3780  RSI: ffff96bcafc63780  RDI: ffff96ec9f0881c4
    RBP: 0000000000000000   R8: 0000000000000198   R9: 0000000000000001
    R10: ffff96bcaf006b80  R11: 0000000000000000  R12: ffff96b2ab5fa500
    R13: ffffffff864799c0  R14: 0000000000000198  R15: ffff96bc6d22f800
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
--- <NMI exception stack> ---
#13 [ffffa91e0d1abe18] queued_spin_lock_slowpath at ffffffff86100d0c
#14 [ffffa91e0d1abe18] blkcg_print_blkgs at ffffffff86479488
#15 [ffffa91e0d1abe58] blkg_print_stat_ios_recursive at ffffffff8647976f
#16 [ffffa91e0d1abe70] seq_read at ffffffff862e24ba
#17 [ffffa91e0d1abed0] vfs_read at ffffffff862bcec9
#18 [ffffa91e0d1abf00] ksys_read at ffffffff862bd2fa
#19 [ffffa91e0d1abf38] do_syscall_64 at ffffffff86003e4b
#20 [ffffa91e0d1abf50] entry_SYSCALL_64_after_hwframe at ffffffff86a0009c
    RIP: 00000000022d3d3b  RSP: 000000c002eed650  RFLAGS: 00000206
    RAX: ffffffffffffffda  RBX: 000000c00006a800  RCX: 00000000022d3d3b
    RDX: 0000000000001000  RSI: 000000c00287c000  RDI: 000000000000009c
    RBP: 000000c002eed6a0   R8: 0000000000000001   R9: 0000000000000002
    R10: 0000000000000000  R11: 0000000000000206  R12: 0000000000000000
    R13: 0000000000000002  R14: 0000000000000002  R15: 0000000000000002
    ORIG_RAX: 0000000000000000  CS: 0033  SS: 002b


Partial dmesg output:

[11670366.594469] predict_server_[86385]: segfault at 40 ip 000055a7e02a1ad0 sp 00007fffa66d25b0 error 4 in predict_server_v9[55a7e014f000+a39000]
[11670366.594478] Code: e4 48 8b 45 e8 48 83 c0 40 48 89 45 f8 c7 45 f0 00 00 00 00 8b 45 f0 be ff ff 00 00 89 c7 e8 54 87 ff ff 89 45 f4 48 8b 45 f8 <8b> 00 39 45 e4 0f 9d c0 c9 c3 f3 0f 1e fa 55 48 89 e5 48 83 ec 20
[11670483.645717] IPv6: ADDRCONF(NETDEV_UP): hybr7d081c9b942: link is not ready
[11670483.649475] IPv6: ADDRCONF(NETDEV_CHANGE): hybr7d081c9b942: link becomes ready
[11736284.310788] ------------[ cut here ]------------
[11736284.310790] NETDEV WATCHDOG: p17p1 (ice): transmit queue 53 timed out
[11736284.310808] WARNING: CPU: 47 PID: 66130 at net/sched/sch_generic.c:466 dev_watchdog+0x1f1/0x200
[11736284.310808] Modules linked in: act_pedit(E) cls_u32(E) sch_prio(E) sch_htb(E) sch_dsmark(E) sch_sfq(E) ali_professor(OE) aqos(OE) aqos_hotfixes(OE) veth(E) ip6t_REJECT(E) nf_reject_ipv6(E) ipt_REJECT(E) nf_reject_ipv4(E) vxlan(E) ip6_udp_tunnel(E) udp_tunnel(E) ipt_rpfilter(E) ip6t_rpfilter(E) xt_multiport(E) iptable_raw(E) ip6table_raw(E) ip_set_hash_ip(E) ip_set_hash_net(E) ip_set_hash_ipportnet(E) ip_set_hash_ipportip(E) ip_set_bitmap_port(E) ip_set_hash_ipport(E) dummy(E) nf_tables(E) ip6table_mangle(E) ip6t_MASQUERADE(E) ip6table_filter(E) iptable_mangle(E) ip6table_nat(E) nf_nat_ipv6(E) xt_comment(E) xt_mark(E) ip6_tables(E) xt_set(E) ip_set(E) ip_vs_sh(E) ip_vs_wrr(E) ip_vs_rr(E) ip_vs(E) xt_conntrack(E) ipt_MASQUERADE(E) nf_conntrack_netlink(E) nfnetlink(E) xt_addrtype(E) iptable_filter(E)
[11736284.310831]  iptable_nat(E) nf_nat_ipv4(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) bpfilter(E) br_netfilter(E) bridge(E) stp(E) llc(E) overlay(E) binfmt_misc(E) scheduler(OE) tcp_diag(E) inet_diag(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E) intel_rapl_msr(E) intel_rapl_common(E) i10nm_edac(E) nfit(E) iTCO_wdt(E) iTCO_vendor_support(E) x86_pkg_temp_thermal(E) coretemp(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) pcbc(E) mei_me(E) joydev(E) isst_if_mbox_pci(E) ioatdma(E) isst_if_mmio(E) aesni_intel(E) pcspkr(E) sg(E) mousedev(E) isst_if_common(E) mei(E) i2c_i801(E) dca(E) glue_helper(E) wmi(E) pcc_cpufreq(E) ipmi_si(E) ipmi_devintf(E) ipmi_msghandler(E) acpi_power_meter(E) acpi_cpufreq(E) auth_rpcgss(E) sunrpc(E) ip_tables(E)
[11736284.310857]  xfs(E) libcrc32c(E) sd_mod(E) ast(E) i2c_algo_bit(E) crc32c_intel(E) ttm(E) ice(OE) megaraid_sas(E) drm_kms_helper(E) syscopyarea(E) intel_auxiliary(OE) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) ahci(E) drm(E) libahci(E) i2c_core(E) libata(E)
[11736284.310868] CPU: 47 PID: 66130 Comm: titanagent Kdump: loaded Tainted: G           OE     4.19.91-27.1.an7.x86_64 #1
[11736284.310869] Hardware name: Inspur SA5280M6/SA5280M6, BIOS 06.03.00 03/08/2023
[11736284.310871] RIP: 0010:dev_watchdog+0x1f1/0x200
[11736284.310872] Code: 63 45 e0 eb 91 4c 89 e7 c6 05 7b d4 c6 00 01 e8 25 56 fc ff 89 d9 48 89 c2 4c 89 e6 48 c7 c7 78 36 17 87 31 c0 e8 ff a1 8d ff <0f> 0b eb be 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 c7
[11736284.310873] RSP: 0000:ffff96ecaf2c3ea0 EFLAGS: 00010286
[11736284.310874] RAX: 0000000000000039 RBX: 0000000000000035 RCX: 0000000000000006
[11736284.310875] RDX: 0000000000000000 RSI: 0000000000000096 RDI: ffff96ecaf2d6bc0
[11736284.310876] RBP: ffff96eca00c4480 R08: 000000000000004c R09: 0000000000523370
[11736284.310877] R10: ffffffff87b0a4e4 R11: 000000000000e4c1 R12: ffff96eca00c4000
[11736284.310877] R13: 000000000000002f R14: 0000000000000001 R15: 0000000000000000
[11736284.310878] FS:  00007f6d790ae700(0000) GS:ffff96ecaf2c0000(0000) knlGS:0000000000000000
[11736284.310879] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[11736284.310880] CR2: 00007f6d98e27000 CR3: 0000005f9161a003 CR4: 0000000000770ee0
[11736284.310881] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[11736284.310882] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[11736284.310882] PKRU: 55555554
[11736284.310882] Call Trace:
[11736284.310885]  <IRQ>
[11736284.310888]  ? dev_graft_qdisc+0x50/0x50
[11736284.310891]  call_timer_fn+0x2d/0x140
[11736284.310893]  expire_timers+0x93/0xf0
[11736284.310894]  run_timer_softirq+0x73/0x140
[11736284.310896]  ? __hrtimer_run_queues+0x11b/0x250
[11736284.310897]  ? ktime_get+0x37/0xa0
[11736284.310900]  __do_softirq+0xd1/0x28c
[11736284.310902]  irq_exit+0xd3/0xf0
[11736284.310904]  smp_apic_timer_interrupt+0x74/0x140
[11736284.310906]  apic_timer_interrupt+0xf/0x20
[11736284.310907]  </IRQ>
[11736284.310911] RIP: 0033:0x8f8cc7
[11736284.310912] Code: 5c 44 89 f0 44 21 d1 c1 c0 0a 01 d7 44 89 f2 01 fe 44 89 d7 f7 d7 23 7c 24 10 c1 c2 1e 31 cf 44 89 f1 01 f7 44 89 ee 44 31 fe <46> 8d 24 27 44 21 f6 c1 c1 13 41 31 f1 31 ca 48 83 c3 20 31 d0 83
[11736284.310912] RSP: 002b:00007f6d790a4740 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13
[11736284.310913] RAX: 00000000ed80ab57 RBX: 0000000000da21e0 RCX: 00000000d5fb602a
[11736284.310914] RDX: 00000000b57ed80a RSI: 00000000926ea4f5 RDI: 00000000a3c50c4d
[11736284.310915] RBP: 0000000000000030 R08: 0000000000da21e0 R09: 000000006c804000
[11736284.310915] R10: 00000000363127cc R11: 0000000000000000 R12: 000000003a878a27
[11736284.310916] R13: 00000000ec804425 R14: 00000000d5fb602a R15: 000000007eeee0d0
[11736284.310917] ---[ end trace 19b6f237bee76adf ]---
[11736284.310946] ice 0000:98:00.0 p17p1: tx_timeout: VSI_num: 6, Q 53, NTC: 0x29, HW_HEAD: 0x11, NTU: 0x12, INT: 0x4000000
[11736284.310947] ice 0000:98:00.0 p17p1: tx_timeout recovery level 1, txqueue 53
[11736289.429740] ice 0000:98:00.0 p17p1: tx_timeout: VSI_num: 6, Q 53, NTC: 0x29, HW_HEAD: 0x11, NTU: 0x12, INT: 0x4000000
[11736289.429743] ice 0000:98:00.0 p17p1: tx_timeout recovery level 2, txqueue 53
[11736292.038919] NMI watchdog: Watchdog detected hard LOCKUP on cpu 107
[11736292.038920] Modules linked in: act_pedit(E) cls_u32(E) sch_prio(E) sch_htb(E) sch_dsmark(E) sch_sfq(E) ali_professor(OE) aqos(OE) aqos_hotfixes(OE) veth(E) ip6t_REJECT(E) nf_reject_ipv6(E) ipt_REJECT(E) nf_reject_ipv4(E) vxlan(E) ip6_udp_tunnel(E) udp_tunnel(E) ipt_rpfilter(E) ip6t_rpfilter(E) xt_multiport(E) iptable_raw(E) ip6table_raw(E) ip_set_hash_ip(E) ip_set_hash_net(E) ip_set_hash_ipportnet(E) ip_set_hash_ipportip(E) ip_set_bitmap_port(E) ip_set_hash_ipport(E) dummy(E) nf_tables(E) ip6table_mangle(E) ip6t_MASQUERADE(E) ip6table_filter(E) iptable_mangle(E) ip6table_nat(E) nf_nat_ipv6(E) xt_comment(E) xt_mark(E) ip6_tables(E) xt_set(E) ip_set(E) ip_vs_sh(E) ip_vs_wrr(E) ip_vs_rr(E) ip_vs(E) xt_conntrack(E) ipt_MASQUERADE(E) nf_conntrack_netlink(E) nfnetlink(E) xt_addrtype(E) iptable_filter(E)
[11736292.038929]  iptable_nat(E) nf_nat_ipv4(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) bpfilter(E) br_netfilter(E) bridge(E) stp(E) llc(E) overlay(E) binfmt_misc(E) scheduler(OE) tcp_diag(E) inet_diag(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E) intel_rapl_msr(E) intel_rapl_common(E) i10nm_edac(E) nfit(E) iTCO_wdt(E) iTCO_vendor_support(E) x86_pkg_temp_thermal(E) coretemp(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) pcbc(E) mei_me(E) joydev(E) isst_if_mbox_pci(E) ioatdma(E) isst_if_mmio(E) aesni_intel(E) pcspkr(E) sg(E) mousedev(E) isst_if_common(E) mei(E) i2c_i801(E) dca(E) glue_helper(E) wmi(E) pcc_cpufreq(E) ipmi_si(E) ipmi_devintf(E) ipmi_msghandler(E) acpi_power_meter(E) acpi_cpufreq(E) auth_rpcgss(E) sunrpc(E) ip_tables(E)
[11736292.038938]  xfs(E) libcrc32c(E) sd_mod(E) ast(E) i2c_algo_bit(E) crc32c_intel(E) ttm(E) ice(OE) megaraid_sas(E) drm_kms_helper(E) syscopyarea(E) intel_auxiliary(OE) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) ahci(E) drm(E) libahci(E) i2c_core(E) libata(E)
[11736292.038941] CPU: 107 PID: 18068 Comm: kubelet Kdump: loaded Tainted: G        W  OE     4.19.91-27.1.an7.x86_64 #1
[11736292.038941] Hardware name: Inspur SA5280M6/SA5280M6, BIOS 06.03.00 03/08/2023
[11736292.038942] RIP: 0010:queued_spin_lock_slowpath+0x10c/0x1b0
[11736292.038942] Code: 48 c1 ee 0c 83 e8 01 83 e6 30 48 98 48 81 c6 80 37 02 00 48 03 34 c5 a0 17 18 87 48 89 16 8b 42 08 85 c0 75 09 f3 90 8b 42 08 <85> c0 74 f7 48 8b 32 48 85 f6 74 07 0f 0d 0e eb 02 f3 90 8b 07 66
[11736292.038943] RSP: 0018:ffffa91e0d1abe18 EFLAGS: 00000046
[11736292.038943] RAX: 0000000000000000 RBX: ffffffff87390b20 RCX: 0000000001b00000
[11736292.038943] RDX: ffff96ecafae3780 RSI: ffff96bcafc63780 RDI: ffff96ec9f0881c4
[11736292.038944] RBP: 0000000000000000 R08: 0000000000000198 R09: 0000000000000001
[11736292.038944] R10: ffff96bcaf006b80 R11: 0000000000000000 R12: ffff96b2ab5fa500
[11736292.038944] R13: ffffffff864799c0 R14: 0000000000000198 R15: ffff96bc6d22f800
[11736292.038945] FS:  00007f4a4bfff700(0000) GS:ffff96ecafac0000(0000) knlGS:0000000000000000
[11736292.038945] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[11736292.038945] CR2: 00007f1e0010e000 CR3: 0000005f72d32004 CR4: 0000000000770ee0
[11736292.038946] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[11736292.038946] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[11736292.038946] PKRU: 55555554
[11736292.038946] Call Trace:
[11736292.038947]  blkcg_print_blkgs+0x68/0xd0
[11736292.038947]  blkg_print_stat_ios_recursive+0x3f/0x50
[11736292.038947]  seq_read+0x14a/0x3e0
[11736292.038947]  vfs_read+0x89/0x130
[11736292.038948]  ksys_read+0x4a/0xc0
[11736292.038948]  do_syscall_64+0x5b/0x1d0
[11736292.038948]  entry_SYSCALL_64_after_hwframe+0x5b/0xc0
[11736292.038948] RIP: 0033:0x22d3d3b
[11736292.038949] Code: fe ff eb bd e8 e6 e7 fd ff e9 61 ff ff ff cc e8 7b b1 fd ff 48 8b 7c 24 10 48 8b 74 24 18 48 8b 54 24 20 48 8b 44 24 08 0f 05 <48> 3d 01 f0 ff ff 76 20 48 c7 44 24 28 ff ff ff ff 48 c7 44 24 30
[11736292.038949] RSP: 002b:000000c002eed650 EFLAGS: 00000206 ORIG_RAX: 0000000000000000
[11736292.038950] RAX: ffffffffffffffda RBX: 000000c00006a800 RCX: 00000000022d3d3b
[11736292.038950] RDX: 0000000000001000 RSI: 000000c00287c000 RDI: 000000000000009c
[11736292.038950] RBP: 000000c002eed6a0 R08: 0000000000000001 R09: 0000000000000002
[11736292.038951] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
[11736292.038951] R13: 0000000000000002 R14: 0000000000000002 R15: 0000000000000002
[11736292.038951] Kernel panic - not syncing: Hard LOCKUP
[11736292.038951] CPU: 107 PID: 18068 Comm: kubelet Kdump: loaded Tainted: G        W  OE     4.19.91-27.1.an7.x86_64 #1
[11736292.038952] Hardware name: Inspur SA5280M6/SA5280M6, BIOS 06.03.00 03/08/2023
[11736292.038952] Call Trace:
[11736292.038952]  <NMI>
[11736292.038952]  dump_stack+0x66/0x8b
[11736292.038953]  panic+0xfa/0x26b
[11736292.038953]  nmi_panic+0x37/0x40
[11736292.038953]  watchdog_hardlockup_check+0xed/0x110
[11736292.038954]  __perf_event_overflow+0x51/0xe0
[11736292.038954]  handle_pmi_common+0x1c9/0x270
[11736292.038954]  ? __set_pte_vaddr+0x32/0x50
[11736292.038954]  ? __native_set_fixmap+0x24/0x30
[11736292.038955]  ? native_set_fixmap+0x35/0x60
[11736292.038955]  ? ghes_copy_tofrom_phys+0xa0/0x140
[11736292.038955]  intel_pmu_handle_irq+0xad/0x2e0
[11736292.038955]  perf_event_nmi_handler+0x2e/0x50
[11736292.038956]  nmi_handle+0x6e/0x110
[11736292.038956]  default_do_nmi+0x3e/0x100
[11736292.038956]  do_nmi+0x117/0x1a0
[11736292.038956]  end_repeat_nmi+0x16/0x65
[11736292.038957] RIP: 0010:queued_spin_lock_slowpath+0x10c/0x1b0
[11736292.038957] Code: 48 c1 ee 0c 83 e8 01 83 e6 30 48 98 48 81 c6 80 37 02 00 48 03 34 c5 a0 17 18 87 48 89 16 8b 42 08 85 c0 75 09 f3 90 8b 42 08 <85> c0 74 f7 48 8b 32 48 85 f6 74 07 0f 0d 0e eb 02 f3 90 8b 07 66
[11736292.038957] RSP: 0018:ffffa91e0d1abe18 EFLAGS: 00000046
[11736292.038958] RAX: 0000000000000000 RBX: ffffffff87390b20 RCX: 0000000001b00000
[11736292.038958] RDX: ffff96ecafae3780 RSI: ffff96bcafc63780 RDI: ffff96ec9f0881c4
[11736292.038958] RBP: 0000000000000000 R08: 0000000000000198 R09: 0000000000000001
[11736292.038959] R10: ffff96bcaf006b80 R11: 0000000000000000 R12: ffff96b2ab5fa500
[11736292.038959] R13: ffffffff864799c0 R14: 0000000000000198 R15: ffff96bc6d22f800
[11736292.038959]  ? blkg_rwstat_recursive_sum+0x160/0x160
[11736292.038960]  ? queued_spin_lock_slowpath+0x10c/0x1b0
[11736292.038960]  ? queued_spin_lock_slowpath+0x10c/0x1b0
[11736292.038960]  </NMI>
[11736292.038960]  blkcg_print_blkgs+0x68/0xd0
[11736292.038961]  blkg_print_stat_ios_recursive+0x3f/0x50
[11736292.038961]  seq_read+0x14a/0x3e0
[11736292.038961]  vfs_read+0x89/0x130
[11736292.038961]  ksys_read+0x4a/0xc0
[11736292.038962]  do_syscall_64+0x5b/0x1d0
[11736292.038962]  entry_SYSCALL_64_after_hwframe+0x5b/0xc0
[11736292.038962] RIP: 0033:0x22d3d3b
[11736292.038963] Code: fe ff eb bd e8 e6 e7 fd ff e9 61 ff ff ff cc e8 7b b1 fd ff 48 8b 7c 24 10 48 8b 74 24 18 48 8b 54 24 20 48 8b 44 24 08 0f 05 <48> 3d 01 f0 ff ff 76 20 48 c7 44 24 28 ff ff ff ff 48 c7 44 24 30
[11736292.038963] RSP: 002b:000000c002eed650 EFLAGS: 00000206 ORIG_RAX: 0000000000000000
[11736292.038963] RAX: ffffffffffffffda RBX: 000000c00006a800 RCX: 00000000022d3d3b
[11736292.038964] RDX: 0000000000001000 RSI: 000000c00287c000 RDI: 000000000000009c
[11736292.038964] RBP: 000000c002eed6a0 R08: 0000000000000001 R09: 0000000000000002
[11736292.038964] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
[11736292.038965] R13: 0000000000000002 R14: 0000000000000002 R15: 0000000000000002
[11736292.038965] Kernel Offset: 0x5000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)



Version-Release number of selected component (if applicable):


# uname -r
4.19.91-27.1.an7.x86_64

# cat /etc/os-release
NAME="Anolis OS"
VERSION="7.9"
ID="anolis"
ID_LIKE="rhel fedora centos"
VERSION_ID="7.9"
PRETTY_NAME="Anolis OS 7.9"
ANSI_COLOR="0;31"
HOME_URL="https://openanolis.cn/"
BUG_REPORT_URL="https://bugs.openanolis.cn/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"


How reproducible:

Occurs repeatedly, though infrequently, in the production environment.


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:
Comment 1 Jingbo Xu alibaba_cloud_group 2024-02-06 10:18:08 UTC
Stack trace:
[11632967.777735] NMI watchdog: Watchdog detected hard LOCKUP on cpu 60
[11632967.777736] Modules linked in: act_pedit(E) cls_u32(E) sch_prio(E) fuse(E) sch_htb(E) sch_dsmark(E) sch_sfq(E) ali_professor(OE) aqos(OE) aqos_hotfixes(OE) ip6t_REJECT(E) nf_reject_ipv6(E) ipt_REJECT(E) nf_reject_ipv4(E) vxlan(E) ip6_udp_tunnel(E) udp_tunnel(E) veth(E) ip6t_rpfilter(E) xt_multiport(E) ipt_rpfilter(E) iptable_raw(E) ip6table_raw(E) ip_set_hash_ip(E) ip_set_hash_net(E) ip_set_bitmap_port(E) ip_set_hash_ipportnet(E) ip_set_hash_ipportip(E) ip_set_hash_ipport(E) dummy(E) nf_tables(E) ip6table_mangle(E) ip6t_MASQUERADE(E) ip6table_filter(E) iptable_mangle(E) ip6table_nat(E) nf_nat_ipv6(E) xt_comment(E) xt_mark(E) ip6_tables(E) xt_set(E) ip_set(E) ip_vs_sh(E) ip_vs_wrr(E) ip_vs_rr(E) ip_vs(E) xt_conntrack(E) ipt_MASQUERADE(E) nf_conntrack_netlink(E) nfnetlink(E) xt_addrtype(E) iptable_filter(E)
[11632967.777745]  iptable_nat(E) nf_nat_ipv4(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) bpfilter(E) br_netfilter(E) bridge(E) stp(E) llc(E) overlay(E) binfmt_misc(E) scheduler(OE) tcp_diag(E) inet_diag(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E) intel_rapl_msr(E) intel_rapl_common(E) i10nm_edac(E) nfit(E) x86_pkg_temp_thermal(E) coretemp(E) kvm_intel(E) kvm(E) iTCO_wdt(E) irqbypass(E) iTCO_vendor_support(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) pcbc(E) aesni_intel(E) glue_helper(E) pcspkr(E) sg(E) isst_if_mbox_pci(E) joydev(E) mousedev(E) isst_if_mmio(E) isst_if_common(E) mei_me(E) ioatdma(E) mei(E) i2c_i801(E) dca(E) wmi(E) pcc_cpufreq(E) ipmi_si(E) ipmi_devintf(E) ipmi_msghandler(E) acpi_power_meter(E) acpi_cpufreq(E) auth_rpcgss(E) sunrpc(E) ip_tables(E)
[11632967.777754]  xfs(E) libcrc32c(E) sd_mod(E) ast(E) crc32c_intel(E) i2c_algo_bit(E) ttm(E) ice(OE) megaraid_sas(E) drm_kms_helper(E) intel_auxiliary(OE) syscopyarea(E) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) ahci(E) drm(E) libahci(E) i2c_core(E) libata(E)
[11632967.777757] CPU: 60 PID: 24228 Comm: kubelet Kdump: loaded Tainted: G           OE     4.19.91-27.1.an7.x86_64 #1
[11632967.777757] Hardware name: Inspur SA5280M6/SA5280M6, BIOS 06.00.01 10/27/2021
[11632967.777758] RIP: 0010:queued_spin_lock_slowpath+0x109/0x1b0
[11632967.777758] Code: c1 e8 12 48 c1 ee 0c 83 e8 01 83 e6 30 48 98 48 81 c6 80 37 02 00 48 03 34 c5 a0 17 18 95 48 89 16 8b 42 08 85 c0 75 09 f3 90 <8b> 42 08 85 c0 74 f7 48 8b 32 48 85 f6 74 07 0f 0d 0e eb 02 f3 90
[11632967.777759] RSP: 0018:ffffa41d3a7c3e18 EFLAGS: 00000046
[11632967.777759] RAX: 0000000000000000 RBX: ffffffff95390b20 RCX: 0000000000f40000
[11632967.777760] RDX: ffff94aeafe23780 RSI: ffff94deafae3780 RDI: ffff94de9e1b0bcc
[11632967.777760] RBP: 0000000000000000 R08: 0000000000000198 R09: 0000000000000001
[11632967.777760] R10: ffff94ae6a184f00 R11: 0000000000000000 R12: ffff94ad15517c00
[11632967.777761] R13: ffffffff944799c0 R14: 0000000000000198 R15: ffff94ae140ed400
[11632967.777761] FS:  00007f04ee7fc700(0000) GS:ffff94aeafe00000(0000) knlGS:0000000000000000
[11632967.777761] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[11632967.777762] CR2: 00007f83fcaa6010 CR3: 0000002f72d20006 CR4: 0000000000770ee0
[11632967.777762] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[11632967.777762] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[11632967.777762] PKRU: 55555554
[11632967.777763] Call Trace:
[11632967.777763]  blkcg_print_blkgs+0x68/0xd0
[11632967.777763]  blkg_print_stat_ios_recursive+0x3f/0x50
[11632967.777763]  seq_read+0x14a/0x3e0
[11632967.777764]  vfs_read+0x89/0x130
[11632967.777764]  ksys_read+0x4a/0xc0
[11632967.777764]  do_syscall_64+0x5b/0x1d0
[11632967.777764]  entry_SYSCALL_64_after_hwframe+0x5b/0xc0
[11632967.777765] RIP: 0033:0x22d3d3b
[11632967.777765] Code: fe ff eb bd e8 e6 e7 fd ff e9 61 ff ff ff cc e8 7b b1 fd ff 48 8b 7c 24 10 48 8b 74 24 18 48 8b 54 24 20 48 8b 44 24 08 0f 05 <48> 3d 01 f0 ff ff 76 20 48 c7 44 24 28 ff ff ff ff 48 c7 44 24 30
[11632967.777765] RSP: 002b:000000c005a4f650 EFLAGS: 00000206 ORIG_RAX: 0000000000000000
[11632967.777766] RAX: ffffffffffffffda RBX: 000000c000060800 RCX: 00000000022d3d3b
[11632967.777766] RDX: 0000000000001000 RSI: 000000c004b00000 RDI: 00000000000000a3
[11632967.777767] RBP: 000000c005a4f6a0 R08: 0000000000000001 R09: 0000000000000002
[11632967.777767] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
[11632967.777767] R13: 0000000000000002 R14: 0000000000000002 R15: 0000000000000002
[11632967.777767] Kernel panic - not syncing: Hard LOCKUP
[11632967.777768] CPU: 60 PID: 24228 Comm: kubelet Kdump: loaded Tainted: G           OE     4.19.91-27.1.an7.x86_64 #1
[11632967.777768] Hardware name: Inspur SA5280M6/SA5280M6, BIOS 06.00.01 10/27/2021
[11632967.777768] Call Trace:
[11632967.777769]  <NMI>
[11632967.777769]  dump_stack+0x66/0x8b
[11632967.777769]  panic+0xfa/0x26b
[11632967.777769]  nmi_panic+0x37/0x40
[11632967.777769]  watchdog_hardlockup_check+0xed/0x110
[11632967.777770]  __perf_event_overflow+0x51/0xe0
[11632967.777770]  handle_pmi_common+0x1c9/0x270
[11632967.777770]  ? __set_pte_vaddr+0x32/0x50
[11632967.777770]  ? __native_set_fixmap+0x24/0x30
[11632967.777771]  ? native_set_fixmap+0x35/0x60
[11632967.777771]  ? ghes_copy_tofrom_phys+0xa0/0x140
[11632967.777771]  intel_pmu_handle_irq+0xad/0x2e0
[11632967.777771]  perf_event_nmi_handler+0x2e/0x50
[11632967.777772]  nmi_handle+0x6e/0x110
[11632967.777772]  default_do_nmi+0x3e/0x100
[11632967.777772]  do_nmi+0x117/0x1a0
[11632967.777772]  end_repeat_nmi+0x16/0x65
[11632967.777773] RIP: 0010:queued_spin_lock_slowpath+0x109/0x1b0
[11632967.777773] Code: c1 e8 12 48 c1 ee 0c 83 e8 01 83 e6 30 48 98 48 81 c6 80 37 02 00 48 03 34 c5 a0 17 18 95 48 89 16 8b 42 08 85 c0 75 09 f3 90 <8b> 42 08 85 c0 74 f7 48 8b 32 48 85 f6 74 07 0f 0d 0e eb 02 f3 90
[11632967.777773] RSP: 0018:ffffa41d3a7c3e18 EFLAGS: 00000046
[11632967.777774] RAX: 0000000000000000 RBX: ffffffff95390b20 RCX: 0000000000f40000
[11632967.777774] RDX: ffff94aeafe23780 RSI: ffff94deafae3780 RDI: ffff94de9e1b0bcc
[11632967.777775] RBP: 0000000000000000 R08: 0000000000000198 R09: 0000000000000001
[11632967.777775] R10: ffff94ae6a184f00 R11: 0000000000000000 R12: ffff94ad15517c00
[11632967.777775] R13: ffffffff944799c0 R14: 0000000000000198 R15: ffff94ae140ed400
[11632967.777775]  ? blkg_rwstat_recursive_sum+0x160/0x160
[11632967.777776]  ? queued_spin_lock_slowpath+0x109/0x1b0
[11632967.777776]  ? queued_spin_lock_slowpath+0x109/0x1b0
[11632967.777776]  </NMI>
[11632967.777776]  blkcg_print_blkgs+0x68/0xd0
[11632967.777777]  blkg_print_stat_ios_recursive+0x3f/0x50
[11632967.777777]  seq_read+0x14a/0x3e0
[11632967.777777]  vfs_read+0x89/0x130
[11632967.777777]  ksys_read+0x4a/0xc0
[11632967.777778]  do_syscall_64+0x5b/0x1d0
[11632967.777778]  entry_SYSCALL_64_after_hwframe+0x5b/0xc0
[11632967.777778] RIP: 0033:0x22d3d3b
[11632967.777779] Code: fe ff eb bd e8 e6 e7 fd ff e9 61 ff ff ff cc e8 7b b1 fd ff 48 8b 7c 24 10 48 8b 74 24 18 48 8b 54 24 20 48 8b 44 24 08 0f 05 <48> 3d 01 f0 ff ff 76 20 48 c7 44 24 28 ff ff ff ff 48 c7 44 24 30
[11632967.777779] RSP: 002b:000000c005a4f650 EFLAGS: 00000206 ORIG_RAX: 0000000000000000
[11632967.777780] RAX: ffffffffffffffda RBX: 000000c000060800 RCX: 00000000022d3d3b
[11632967.777780] RDX: 0000000000001000 RSI: 000000c004b00000 RDI: 00000000000000a3
[11632967.777780] RBP: 000000c005a4f6a0 R08: 0000000000000001 R09: 0000000000000002
[11632967.777780] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
[11632967.777781] R13: 0000000000000002 R14: 0000000000000002 R15: 0000000000000002
[11632967.777781] Kernel Offset: 0x13000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)


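[Root cause analysis]:

Based on the vmcore analysis, the kernel appears to have a deadlock risk.
1 Code suspected of deadlocking
1.1 Lock order: ioc->lock, then q->queue_lock (the request queue spinlock)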
static ssize_t ioc_qos_write(struct kernfs_open_file *of, char *input,
                             size_t nbytes, loff_t off)
{
        /* ... */
        spin_lock_irq(&ioc->lock);

        if (enable) {
                blk_stat_enable_accounting(ioc->rqos.q);
                blk_queue_flag_set(QUEUE_FLAG_RQ_ALLOC_TIME, ioc->rqos.q);
                ioc->enabled = true;
        } else {
                blk_queue_flag_clear(QUEUE_FLAG_RQ_ALLOC_TIME, ioc->rqos.q);
                ioc->enabled = false;
        }

        if (user) {
                memcpy(ioc->params.qos, qos, sizeof(qos));
                ioc->user_qos_params = true;
        } else {
                ioc->user_qos_params = false;
        }

        ioc_refresh_params(ioc, true);
        spin_unlock_irq(&ioc->lock);
}
// Note: in newer kernels (5.10), this function no longer takes queue_lock
void blk_queue_flag_clear(unsigned int flag, struct request_queue *q)
{               
        unsigned long flags; 
        
        spin_lock_irqsave(q->queue_lock, flags);
        queue_flag_clear(flag, q);
        spin_unlock_irqrestore(q->queue_lock, flags);
}         
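As background for the note above: upstream kernels that no longer take queue_lock in this helper update the flag word with atomic bit operations instead. The following is a minimal userspace sketch of that idea using C11 atomics as stand-ins. The `demo_*` names are hypothetical and not kernel APIs, and this is not a claim about how the fix for this bug was implemented.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for q->queue_flags; not a kernel structure. */
static atomic_ulong demo_queue_flags;

/* Set one flag bit without taking any lock. */
void demo_flag_set(unsigned int flag)
{
    atomic_fetch_or(&demo_queue_flags, 1UL << flag);
}

/* Clear one flag bit without taking any lock. */
void demo_flag_clear(unsigned int flag)
{
    atomic_fetch_and(&demo_queue_flags, ~(1UL << flag));
}

/* Return 1 if the flag bit is currently set, 0 otherwise. */
int demo_flag_test(unsigned int flag)
{
    return (int)((atomic_load(&demo_queue_flags) >> flag) & 1UL);
}
```

Because the update takes no lock, a caller that already holds another lock (such as ioc->lock in ioc_qos_write) would not add a lock-ordering edge by changing a flag this way.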

1.2 Lock order: q->queue_lock, then ioc->lock
static inline bool blkcg_bio_issue_check(struct request_queue *q,
                                         struct bio *bio)
{
        struct blkcg *blkcg;
        struct blkcg_gq *blkg;
        bool throtl = false;
        /* ... */

        blkg = blkg_lookup(blkcg, q);
        if (unlikely(!blkg)) {
                spin_lock_irq(q->queue_lock);
                blkg = blkg_lookup_create(blkcg, q);
                if (IS_ERR(blkg))
                        blkg = NULL;
                spin_unlock_irq(q->queue_lock);
        }
        /* ... */
}

struct blkcg_gq *blkg_lookup_create(struct blkcg *blkcg,
                                    struct request_queue *q)
{
        /* ... */
        while (true) {
                struct blkcg *pos = blkcg;
                struct blkcg *parent = blkcg_parent(blkcg);

                while (parent && !__blkg_lookup(parent, q, false)) {
                        pos = parent;
                        parent = blkcg_parent(parent);
                }

                blkg = blkg_create(pos, q, NULL);
                if (pos == blkcg || IS_ERR(blkg))
                        return blkg;
        }
}

static struct blkcg_gq *blkg_create(struct blkcg *blkcg,
                                    struct request_queue *q,
                                    struct blkcg_gq *new_blkg)
{
        /* ... */
        /* invoke per-policy init */
        for (i = 0; i < BLKCG_MAX_POLS; i++) {
                struct blkcg_policy *pol = blkcg_policy[i];

                if (blkg->pd[i] && pol->pd_init_fn)
                        pol->pd_init_fn(blkg->pd[i]);    //ioc_pd_init
        }
        /* ... */
}

static void ioc_pd_init(struct blkg_policy_data *pd)
{
        struct ioc_gq *iocg = pd_to_iocg(pd);
        struct blkcg_gq *blkg = pd_to_blkg(&iocg->pd);
        struct ioc *ioc = q_to_ioc(blkg->q);
        struct ioc_now now;
        struct blkcg_gq *tblkg;
        unsigned long flags;

        ioc_now(ioc, &now);
        /* ... */
        spin_lock_irqsave(&ioc->lock, flags);
        weight_updated(iocg);
        spin_unlock_irqrestore(&ioc->lock, flags);
}
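The two lock orders in 1.1 and 1.2 form a classic AB-BA inversion. The sketch below replays that interleaving in userspace with a toy try-lock built on C11 `atomic_flag`. Everything here (`toy_spinlock`, `toy_trylock`, the two function names) is hypothetical illustration code, not kernel code, and a single thread stands in for both kernel paths so the demo can never actually hang.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy lock: only try-acquire and release, so nothing can spin forever. */
typedef struct { atomic_flag held; } toy_spinlock;

static bool toy_trylock(toy_spinlock *l)
{
    /* test_and_set returns the previous value: false means we got it. */
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

static void toy_unlock(toy_spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

static toy_spinlock ioc_lock = { ATOMIC_FLAG_INIT };   /* stands in for ioc->lock */
static toy_spinlock queue_lock = { ATOMIC_FLAG_INIT }; /* stands in for q->queue_lock */

/* Replays the suspected interleaving; true means neither path can proceed. */
bool abba_would_deadlock(void)
{
    bool a_has_ioc    = toy_trylock(&ioc_lock);   /* path A: ioc_qos_write */
    bool b_has_queue  = toy_trylock(&queue_lock); /* path B: blkcg_bio_issue_check */
    bool a_gets_queue = toy_trylock(&queue_lock); /* A: blk_queue_flag_clear */
    bool b_gets_ioc   = toy_trylock(&ioc_lock);   /* B: ioc_pd_init */
    bool stuck = a_has_ioc && b_has_queue && !a_gets_queue && !b_gets_ioc;

    toy_unlock(&queue_lock);
    toy_unlock(&ioc_lock);
    return stuck;
}

/* Same two paths, but both take queue_lock before ioc_lock. */
bool consistent_order_makes_progress(void)
{
    bool a_q = toy_trylock(&queue_lock);  /* A wins the outer lock */
    bool b_q = toy_trylock(&queue_lock);  /* B must wait, holding nothing */
    bool a_i = toy_trylock(&ioc_lock);    /* A gets the inner lock */
    toy_unlock(&ioc_lock);
    toy_unlock(&queue_lock);              /* A is done */

    bool b_q2 = toy_trylock(&queue_lock); /* B can now proceed */
    bool b_i  = toy_trylock(&ioc_lock);
    toy_unlock(&ioc_lock);
    toy_unlock(&queue_lock);
    return a_q && !b_q && a_i && b_q2 && b_i;
}
```

In the kernel both waiters spin with interrupts disabled, so instead of a failed try-lock the result is CPUs that never return, which is what the hard lockup watchdog reports.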

2 Stack evidence of the suspected deadlock

crash> bt -c 71
PID: 36365  TASK: ffff94dcf971a080  CPU: 71  COMMAND: "slo-agent"
 #0 [fffffe0000c3ce60] crash_nmi_callback at ffffffff940589f3
 #1 [fffffe0000c3ce68] nmi_handle at ffffffff9402b8fe
 #2 [fffffe0000c3ceb8] default_do_nmi at ffffffff9402be4e
 #3 [fffffe0000c3ced0] do_nmi at ffffffff9402bf87
 #4 [fffffe0000c3cef0] end_repeat_nmi at ffffffff94a0162d
    [exception RIP: queued_spin_lock_slowpath+93]
    RIP: ffffffff94100c5d  RSP: ffffa41d37823d58  RFLAGS: 00000002
    RAX: 0000000000cc0101  RBX: 0000000000000046  RCX: 0000000000b30a97
    RDX: 0000000000000000  RSI: 0000000000000000  RDI: ffff94de9e1b0bcc
    RBP: 000000000000001d   R8: 0000000000027040   R9: ffffffff9449d59c
    R10: ffff94aeb00e7040  R11: fffff18bfe89b200  R12: ffffffffffffffea
    R13: 0000000000000000  R14: 0000000000000000  R15: ffff94ae63383428
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
--- <NMI exception stack> ---
 #5 [ffffa41d37823d58] queued_spin_lock_slowpath at ffffffff94100c5d
 #6 [ffffa41d37823d58] _raw_spin_lock_irqsave at ffffffff948e8172
 #7 [ffffa41d37823d68] blk_queue_flag_clear at ffffffff944565d8
 #8 [ffffa41d37823d80] ioc_qos_write at ffffffff944843f6    // holds spin_lock_irq(&ioc->lock) first, then waits on spin_lock_irqsave(q->queue_lock, flags)
 #9 [ffffa41d37823e68] cgroup_file_write at ffffffff941513d3
#10 [ffffa41d37823e90] kernfs_fop_write at ffffffff9435a000
#11 [ffffa41d37823ec8] vfs_write at ffffffff942bd17d
#12 [ffffa41d37823f00] ksys_write at ffffffff942bd3fa
#13 [ffffa41d37823f38] do_syscall_64 at ffffffff94003e4b
#14 [ffffa41d37823f50] entry_SYSCALL_64_after_hwframe at ffffffff94a0009c
    RIP: 00000000004cb9fb  RSP: 000000c002263328  RFLAGS: 00000212
    RAX: ffffffffffffffda  RBX: 000000c00004c800  RCX: 00000000004cb9fb
    RDX: 000000000000000d  RSI: 000000c0022635c8  RDI: 0000000000000056
    RBP: 000000c002263378   R8: 000000c002263401   R9: 0000000000000004
    R10: 00007fc9d1805d00  R11: 0000000000000212  R12: 00000000000000f2
    R13: 0000000000000000  R14: 0000000001e7dfaa  R15: 0000000000000000
    ORIG_RAX: 0000000000000001  CS: 0033  SS: 002b

crash> bt -c 78
PID: 74959  TASK: ffff94874adc2080  CPU: 78  COMMAND: "kworker/u225:7"
 #0 [fffffe0000d70e60] crash_nmi_callback at ffffffff940589f3
 #1 [fffffe0000d70e68] nmi_handle at ffffffff9402b8fe
 #2 [fffffe0000d70eb8] default_do_nmi at ffffffff9402be4e
 #3 [fffffe0000d70ed0] do_nmi at ffffffff9402bf87
 #4 [fffffe0000d70ef0] end_repeat_nmi at ffffffff94a0162d
    [exception RIP: queued_spin_lock_slowpath+93]
    RIP: ffffffff94100c5d  RSP: ffffa41d268f7788  RFLAGS: 00000002
    RAX: 0000000000000101  RBX: 0000000000000086  RCX: 0000000000000002
    RDX: 0000000000000000  RSI: 0000000000000000  RDI: ffff94ae633834c0
    RBP: 0000000000000000   R8: 0000000000000001   R9: ffff94ae67857c00
    R10: ffff94aeaf007000  R11: 0000000000000010  R12: ffff94ae67857c00
    R13: ffff94dd5386c200  R14: ffff94de9e1b0a08  R15: 0000000000000008
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
--- <NMI exception stack> ---
 #5 [ffffa41d268f7788] queued_spin_lock_slowpath at ffffffff94100c5d
 #6 [ffffa41d268f7788] _raw_spin_lock_irqsave at ffffffff948e8172    // waiting on spin_lock_irqsave(&ioc->lock, flags)
 #7 [ffffa41d268f7798] ioc_pd_init at ffffffff94484617
 #8 [ffffa41d268f77d8] blkg_create at ffffffff94478fda    // blkcg_bio_issue_check() holds spin_lock_irq(q->queue_lock) at this point
 #9 [ffffa41d268f7818] blkg_lookup_create at ffffffff9447af67
#10 [ffffa41d268f7840] generic_make_request_checks at ffffffff94457a86
#11 [ffffa41d268f78d8] generic_make_request at ffffffff944598ad
#12 [ffffa41d268f7920] __split_and_process_non_flush at ffffffffc058a453 [dm_mod]
#13 [ffffa41d268f7968] dm_process_bio at ffffffffc058a761 [dm_mod]
#14 [ffffa41d268f79d0] dm_make_request at ffffffffc058ab6e [dm_mod]
#15 [ffffa41d268f79f8] generic_make_request at ffffffff944599cf
#16 [ffffa41d268f7a40] submit_bio at ffffffff94459c4e
#17 [ffffa41d268f7aa0] ext4_io_submit at ffffffff9439b7f7
#18 [ffffa41d268f7ac0] ext4_writepages at ffffffff943823da
#19 [ffffa41d268f7c18] do_writepages at ffffffff942125da
#20 [ffffa41d268f7c30] __writeback_single_inode at ffffffff942eb6b9
#21 [ffffa41d268f7c68] writeback_sb_inodes at ffffffff942ec0a6
#22 [ffffa41d268f7d48] wb_writeback at ffffffff942ec481
#23 [ffffa41d268f7df0] wb_workfn at ffffffff942ecc53
#24 [ffffa41d268f7e78] process_one_work at ffffffff940bd11b
#25 [ffffa41d268f7eb8] worker_thread at ffffffff940bd379
#26 [ffffa41d268f7f10] kthread at ffffffff940c30d8
#27 [ffffa41d268f7f50] ret_from_fork at ffffffff94a0021f
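
One way to cross-check the traces in this comment is to group the spinning CPUs by the lock they are waiting on. The sketch below does that with the RDI values copied from the register dumps above. It rests on an assumption: that RDI still holds the first argument of queued_spin_lock_slowpath (the lock pointer, per the x86-64 calling convention) at the sampled instruction. The struct and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct spin_waiter {
    int cpu;
    const char *comm;
    uint64_t lock_addr; /* RDI from the register dump (assumed lock pointer) */
};

/* Values taken from the CPU 60, 71 and 78 traces in this comment. */
static const struct spin_waiter waiters[] = {
    { 60, "kubelet",        0xffff94de9e1b0bccULL },
    { 71, "slo-agent",      0xffff94de9e1b0bccULL },
    { 78, "kworker/u225:7", 0xffff94ae633834c0ULL },
};
#define NUM_WAITERS (sizeof(waiters) / sizeof(waiters[0]))

/* Count how many sampled CPUs are spinning on the given lock address. */
size_t count_waiters_on(uint64_t lock_addr)
{
    size_t n = 0;
    for (size_t i = 0; i < NUM_WAITERS; i++)
        if (waiters[i].lock_addr == lock_addr)
            n++;
    return n;
}

/* Print each distinct lock address once, followed by its waiters. */
void print_waiters_by_lock(void)
{
    for (size_t i = 0; i < NUM_WAITERS; i++) {
        int seen = 0;
        for (size_t j = 0; j < i; j++)
            if (waiters[j].lock_addr == waiters[i].lock_addr)
                seen = 1;
        if (seen)
            continue;
        printf("lock 0x%016llx: %zu waiter(s)\n",
               (unsigned long long)waiters[i].lock_addr,
               count_waiters_on(waiters[i].lock_addr));
        for (size_t j = i; j < NUM_WAITERS; j++)
            if (waiters[j].lock_addr == waiters[i].lock_addr)
                printf("  CPU %d (%s)\n", waiters[j].cpu, waiters[j].comm);
    }
}
```

Under that assumption, kubelet on CPU 60 and slo-agent on CPU 71 wait on the same address, while the kworker on CPU 78 waits on a different one. That would be consistent with kubelet being a bystander blocked on the queue lock held by the kworker path, rather than a participant in the lock inversion itself.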
Comment 2 小龙 admin 2024-02-06 11:47:45 UTC
The PR Link: https://gitee.com/anolis/cloud-kernel/pulls/2760
Comment 3 Joseph Qi alibaba_cloud_group 2024-02-08 20:26:40 UTC
(In reply to 小龙 from comment #2)
> The PR Link: https://gitee.com/anolis/cloud-kernel/pulls/2760

merged