Bug 399 - kernel selftests test fib_rule_tests.sh triggers Kernel panic - not syncing: softlockup: hung tasks
Summary: kernel selftests test fib_rule_tests.sh triggers Kernel panic - not syncing: softlockup: hung...
Status: CONFIRMED
Alias: None
Product: ANCK 4.19 Dev
Classification: ANCK
Component: general/others
Version: unspecified
Hardware: x86_64 Linux
Importance: P2-High S2-major
Target Milestone: ---
Assignee: Jacob
QA Contact: shuming
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-01-26 11:26 UTC by kangjiangbo
Modified: 2022-07-14 15:54 UTC (History)

See Also:


Attachments
vmcore-dmesg (2.12 MB, text/plain)
2022-01-26 11:29 UTC, kangjiangbo

Description kangjiangbo 2022-01-26 11:26:58 UTC
Description of problem:
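On the 4.19 x86 kernel, the kernel selftests test fib_rule_tests.sh triggers a crash and produces a vmcore:
Kernel panic - not syncing: softlockup: hung tasks
[Reproduction rate]: 3/3, but the issue did not reproduce when the test case was run manually.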


Version-Release number of selected component (if applicable):

# uname -a
Linux i22e11409.eu95sqa 4.19.91-211.git.396b610b8.an8.x86_64 #1 SMP Tue Jan 25 13:29:31 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
[root@i22e11409 1]# cat /etc/os-release
NAME="Anolis OS"
VERSION="8.2"
ID="anolis"
ID_LIKE="rhel fedora centos"
VERSION_ID="8.2"
PLATFORM_ID="platform:an8"
PRETTY_NAME="Anolis OS 8.2"
ANSI_COLOR="0;31"
HOME_URL="https://openanolis.cn/"



How reproducible:


Steps to Reproduce:
1. Download the kernel source package
2. cd tools/testing/selftests/net
3. Run fib_rule_tests.sh
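
Because the report notes that a single manual run did not reproduce the panic, one option is to repeat the test in a loop. The following is a minimal sketch, not part of the original report: the helper name `run_until_failure`, the retry count, and the timeout are illustrative assumptions. It only detects a failing or hanging run; if the kernel panics, the machine goes down and the vmcore is the evidence.

```python
import subprocess
import sys


def run_until_failure(cmd, max_runs=10, cwd=None, timeout=None):
    """Run `cmd` up to `max_runs` times.

    Returns the 1-based index of the first run that fails (non-zero exit
    status or timeout), or None if every run succeeded.
    """
    for i in range(1, max_runs + 1):
        try:
            result = subprocess.run(cmd, cwd=cwd, timeout=timeout)
        except subprocess.TimeoutExpired:
            print(f"run {i}: timed out", file=sys.stderr)
            return i
        print(f"run {i}: exit status {result.returncode}", file=sys.stderr)
        if result.returncode != 0:
            return i
    return None


# Hypothetical usage from the root of a kernel source tree (needs root):
# run_until_failure(["./fib_rule_tests.sh"], max_runs=50,
#                   cwd="tools/testing/selftests/net", timeout=1800)
```

Running other load in parallel may matter as well, since the dump shows a high load average at the time of the panic; that is a guess, not something the report establishes.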

Actual results:
# crash -x /usr/lib/debug/usr/lib/modules/4.19.91-211.git.396b610b8.an8.x86_64/vmlinux /var/crash/127.0.0.1-2022-01-26-04\:43\:48/vmcore

crash 7.2.7-3.el8.1
Copyright (C) 2002-2020  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.

GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...

WARNING: kernel relocated [848MB]: patching 99399 gdb minimal_symbol values

      KERNEL: /usr/lib/debug/usr/lib/modules/4.19.91-211.git.396b610b8.an8.x86_64/vmlinux
    DUMPFILE: /var/crash/127.0.0.1-2022-01-26-04:43:48/vmcore  [PARTIAL DUMP]
        CPUS: 96
        DATE: Wed Jan 26 04:43:24 2022
      UPTIME: 06:33:03
LOAD AVERAGE: 80.47, 61.11, 65.33
       TASKS: 1334
    NODENAME: i22e11409.eu95sqa
     RELEASE: 4.19.91-211.git.396b610b8.an8.x86_64
     VERSION: #1 SMP Tue Jan 25 13:29:31 UTC 2022
     MACHINE: x86_64  (2500 Mhz)
      MEMORY: 767.4 GB
       PANIC: "Kernel panic - not syncing: softlockup: hung tasks"
         PID: 1295287
     COMMAND: "ping"
        TASK: ffff91852fea4180  [THREAD_INFO: ffff91852fea4180]
         CPU: 52
       STATE: TASK_RUNNING (PANIC)

crash> bt
PID: 1295287  TASK: ffff91852fea4180  CPU: 52  COMMAND: "ping"
 #0 [ffff9185ef903d68] machine_kexec at ffffffffb606480a
 #1 [ffff9185ef903db8] __crash_kexec at ffffffffb61478ea
 #2 [ffff9185ef903e78] panic at ffffffffb60a1195
 #3 [ffff9185ef903f20] __hrtimer_run_queues at ffffffffb6129ff0
 #4 [ffff9185ef903f78] hrtimer_interrupt at ffffffffb612aae0
 #5 [ffff9185ef903fd8] smp_apic_timer_interrupt at ffffffffb6a025aa
 #6 [ffff9185ef903ff0] apic_timer_interrupt at ffffffffb6a01b0f
--- <IRQ stack> ---
 #7 [ffffa28119f17d98] apic_timer_interrupt at ffffffffb6a01b0f
    [exception RIP: queued_write_lock_slowpath+74]
    RIP: ffffffffb60fe37a  RSP: ffffa28119f17e40  RFLAGS: 00000206
    RAX: 0000000000002100  RBX: ffffffffba6f1da0  RCX: 0000000000000100
    RDX: 00000000000000ff  RSI: 0000000000d40000  RDI: ffffffffba6f1da4
    RBP: ffffffffba6f1da0   R8: 0000000000cc0000   R9: ffff9184b50bb7b0
    R10: 0000000000000000  R11: 0000000000000000  R12: 0000000000000001
    R13: ffffffffba6f1db0  R14: ffff91857db4b300  R15: 0000000000000000
    ORIG_RAX: ffffffffffffff13  CS: 0010  SS: 0018
 #8 [ffffa28119f17e48] raw_hash_sk at ffffffffb6808c8b
 #9 [ffffa28119f17e70] inet_create at ffffffffb681b2b6
#10 [ffffa28119f17eb0] __sock_create at ffffffffb675619f
#11 [ffffa28119f17ef8] __sys_socket at ffffffffb67570a5
#12 [ffffa28119f17f28] __x64_sys_socket at ffffffffb6757136
#13 [ffffa28119f17f30] do_syscall_64 at ffffffffb60040ff
#14 [ffffa28119f17f50] entry_SYSCALL_64_after_hwframe at ffffffffb6a00085
    RIP: 00007fa8fd99350b  RSP: 00007fff5cdaf888  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 0000000000000002  RCX: 00007fa8fd99350b
    RDX: 0000000000000001  RSI: 0000000000000003  RDI: 0000000000000002
    RBP: 00007fff5cdaf900   R8: 0000000000000001   R9: 0000000000000001
    R10: 00007fff5cdaf8b0  R11: 0000000000000246  R12: 0000000000000002
    R13: 0000000000000001  R14: 00007fff5cdaf904  R15: 00007fa8fed0e400
    ORIG_RAX: 0000000000000029  CS: 0033  SS: 002b
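
The backtrace above can be summarized with a small script. This is a hedged sketch: `call_chain` pulls the function names out of `bt` frames, and `decode_qrwlock` splits a qrwlock counter word into its fields. The bit layout is assumed from include/asm-generic/qrwlock.h as of Linux 4.19 and should be checked against the exact source tree. Treating RAX (0x2100) as the last value read from the lock word is also an assumption; the dump does not say which register holds it.

```python
import re

# Assumed qrwlock layout (include/asm-generic/qrwlock.h, Linux 4.19).
_QW_WAITING = 0x100  # a writer is waiting
_QW_LOCKED = 0x0FF   # a writer holds the lock
_QR_SHIFT = 9        # reader count starts at this bit

# Matches frame lines such as " #8 [ffffa28119f17e48] raw_hash_sk at ffffffffb6808c8b".
FRAME_RE = re.compile(r"^\s*#(\d+)\s+\[[0-9a-f]+\]\s+(\S+)\s+at\s+[0-9a-f]+")

# A few lines copied from the backtrace in this report.
SAMPLE_BT = """\
 #7 [ffffa28119f17d98] apic_timer_interrupt at ffffffffb6a01b0f
    [exception RIP: queued_write_lock_slowpath+74]
 #8 [ffffa28119f17e48] raw_hash_sk at ffffffffb6808c8b
 #9 [ffffa28119f17e70] inet_create at ffffffffb681b2b6
"""


def call_chain(bt_text):
    """Return the function names of the bt frames, innermost first."""
    chain = []
    for line in bt_text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            chain.append(match.group(2))
    return chain


def decode_qrwlock(cnts):
    """Split a qrwlock counter word into writer flags and reader count."""
    return {
        "writer_locked": (cnts & _QW_LOCKED) == _QW_LOCKED,
        "writer_waiting": bool(cnts & _QW_WAITING),
        "readers": cnts >> _QR_SHIFT,
    }
```

Under these assumptions, `decode_qrwlock(0x2100)` reports a waiting writer and a reader count of 16. That would be consistent with `raw_hash_sk` spinning in `queued_write_lock_slowpath` until the readers drain, but it needs to be confirmed by inspecting the lock in the vmcore.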


Expected results:


Additional info:
# free -mh
              total        used        free      shared  buff/cache   available
Mem:          755Gi       4.1Gi       750Gi        98Mi       575Mi       747Gi
Swap:         2.0Gi          0B       2.0Gi
[root@i22e11409 1]# lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              96
On-line CPU(s) list: 0-95
Thread(s) per core:  2
Core(s) per socket:  24
Socket(s):           2
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               85
Model name:          Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz
Stepping:            4
CPU MHz:             1646.995
CPU max MHz:         3100.0000
CPU min MHz:         1000.0000
BogoMIPS:            5000.00
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            1024K
L3 cache:            33792K
NUMA node0 CPU(s):   0-95
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single pti intel_ppin ssbd mba ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts pku ospke md_clear flush_l1d
Comment 1 kangjiangbo 2022-01-26 11:29:35 UTC
Created attachment 142
vmcore-dmesg
Comment 2 kangjiangbo 2022-02-08 14:28:55 UTC
The ipv4 msg_more case in the kernel selftests test udpgso.sh can also trigger this problem, but it cannot be reproduced by running the test manually.