[Defect description]: The pstore_post_reboot_tests case in kernel-selftest fails.
Reproduction environment: anck 5.10, x86 physical machine
Reproduction rate: always

Kernel information:
    # uname -r
    5.10.134-320.git.04d8c84896c6.an8.x86_64

OS information:
    # cat /etc/os-release
    NAME="Anolis OS"
    VERSION="8.8"
    ID="anolis"
    ID_LIKE="rhel fedora centos"
    VERSION_ID="8.8"
    PLATFORM_ID="platform:an8"
    PRETTY_NAME="Anolis OS 8.8"
    ANSI_COLOR="0;31"
    HOME_URL="https://openanolis.cn/"

CPU information:
    # lscpu
    Architecture:        x86_64
    CPU op-mode(s):      32-bit, 64-bit
    Byte Order:          Little Endian
    CPU(s):              4
    On-line CPU(s) list: 0-3
    Thread(s) per core:  2
    Core(s) per socket:  2
    Socket(s):           1
    NUMA node(s):        1
    Vendor ID:           GenuineIntel
    BIOS Vendor ID:      Alibaba Cloud
    CPU family:          6
    Model:               106
    Model name:          Intel(R) Xeon(R) Platinum 8369B CPU @ 2.70GHz
    BIOS Model name:     pc-i440fx-2.1
    Stepping:            6
    CPU MHz:             2699.998
    BogoMIPS:            5399.99
    Hypervisor vendor:   KVM
    Virtualization type: full
    L1d cache:           48K
    L1i cache:           32K
    L2 cache:            1280K
    L3 cache:            49152K
    NUMA node0 CPU(s):   0-3
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid tsc_known_freq pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single ibrs_enhanced fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves wbnoinvd arat avx512vbmi pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid fsrm arch_capabilities

Memory information:
    # free -h
                  total   used    free    shared  buff/cache  available
    Mem:          15Gi    226Mi   13Gi    1.0Mi   1.9Gi       14Gi
    Swap:         0B      0B      0B

[Reproduction steps]:
1. Download the kernel source package that matches the running kernel.
2. rpm -ivh xxx.src.rpm
   (installs under /root by default)
3. yum-builddep -y rpmbuild/SPECS/kernel.spec
   (installs the build dependencies automatically; requires yum-utils)
4. rpmbuild -bp ./rpmbuild/SPECS/kernel.spec
   (applies the patches, unpacks the tarball, and creates the BUILD directory)
5. cd rpmbuild/BUILD/kernel-xxx/linux-xxx/
   (the tests can now be built)
6. cd tools/testing/selftests/pstore
7. make
8. Run the test case: ./pstore_post_reboot_tests

[Expected result]: the case passes
[Actual result]: the case fails
[Root cause analysis]:
1. Before February 23 this case was always skipped.
2. Since the evening of February 24 it has always failed.
3. Investigation shows the case is judged as failed because the sysfs parameter file /sys/module/pstore/parameters/backend is missing. That file was present until the evening of the 24th.
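To see what the pstore module still exposes once the backend file is gone, the parameter directory can be listed directly. The following is a minimal sketch rather than part of the selftest: the function name list_module_params is made up for illustration, and the optional argument exists only so the helper can be pointed at another directory.

```shell
#!/bin/sh
# Sketch: print name=value for every entry in a module parameter directory.
# list_module_params is a hypothetical helper, not part of the pstore selftests.
list_module_params() {
    # Default to the directory named in the failure log.
    param_dir="${1:-/sys/module/pstore/parameters}"
    if [ ! -d "$param_dir" ]; then
        echo "no parameter directory: $param_dir"
        return 1
    fi
    for f in "$param_dir"/*; do
        # An empty directory leaves the glob unexpanded; skip that case.
        [ -e "$f" ] || continue
        # Some parameters may not be readable by the current user.
        value=$(cat "$f" 2>/dev/null) || value="<unreadable>"
        printf '%s=%s\n' "$(basename "$f")" "$value"
    done
}
```

Running list_module_params with no argument on an affected kernel shows which parameters remain under /sys/module/pstore/parameters, which helps when comparing against a kernel from before the change.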
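There have been recent pstore-related commits in the kernel code, so the failure may be caused by a kernel change. Could the developers please confirm whether this is expected behavior? https://gitee.com/anolis/cloud-kernel/pulls/1231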
[Anolis8][Anck 5.10 aarch64][internal nightly] The pstore_post_reboot_tests case shows the same problem, and the pstore_tests case also fails for the same reason.

Failure log for pstore_post_reboot_tests:
    # ./pstore_post_reboot_tests
    === Pstore unit tests (pstore_post_reboot_tests) ===
    UUID=d248ad17-54cb-450f-ae8e-772bc553a318
    Checking pstore backend is registered ... cat: /sys/module/pstore/parameters/backend: No such file or directory
    FAIL
    backend=
    cmdline=BOOT_IMAGE=(hd1,gpt2)/boot/vmlinuz-5.10.134-568.git.eb96941d6.an8.aarch64 root=UUID=7418b0c7-2c7e-4f3b-a364-88735caf892a ro console=tty0 console=ttyS0,115200 rd.driver.pre=ahci cgroup.memory=nokmem crashkernel=0M-2G:0M,2G-8G:192M,8G-:256M iommu.passthrough=1 iommu.strict=0

Failure log for pstore_tests:
    # ./pstore_tests
    === Pstore unit tests (pstore_tests) ===
    UUID=79bc7ac1-887b-4fc2-bb85-84f1101d01c1
    Checking pstore backend is registered ... cat: /sys/module/pstore/parameters/backend: No such file or directory
    FAIL
    backend=
    cmdline=BOOT_IMAGE=(hd1,gpt2)/boot/vmlinuz-5.10.134-568.git.eb96941d6.an8.aarch64 root=UUID=7418b0c7-2c7e-4f3b-a364-88735caf892a ro console=tty0 console=ttyS0,115200 rd.driver.pre=ahci cgroup.memory=nokmem crashkernel=0M-2G:0M,2G-8G:192M,8G-:256M iommu.passthrough=1 iommu.strict=0
This problem also exists on the an8 5.10.134-14_rc1 kernel.
    # uname -r
    5.10.134-14_rc1.an8.x86_64

Test log:
    # selftests: pstore: pstore_post_reboot_tests
    # === Pstore unit tests (pstore_post_reboot_tests) ===
    # UUID=6702a5a8-7055-49cd-adb6-f849398c8ccf
    # Checking pstore backend is registered ... cat: /sys/module/pstore/parameters/backend: No such file or directory
    # FAIL
    # backend=
    # cmdline=BOOT_IMAGE=(hd0,msdos1)/boot/vmlinuz-5.10.134-14_rc1.an8.x86_64 root=UUID=118262e8-70d9-4b7e-9ae0-fdd5c12355d2 ro cryptomgr.notests rcupdate.rcu_cpu_stall_timeout=300 vring_force_dma_api rhgb quiet biosdevname=0 net.ifnames=0 console=tty0 console=ttyS0,115200n8 noibrs nvme_core.io_timeout=4294967295 nvme_core.admin_timeout=4294967295 cgroup.memory=nokmem crashkernel=0M-2G:0M,2G-8G:192M,8G-:256M
    not ok 2 selftests: pstore: pstore_post_reboot_tests # exit=1

    # ll /sys/module/pstore/parameters/backend
    ls: cannot access '/sys/module/pstore/parameters/backend': No such file or directory
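pstore has changed, and the failure looks like the result of a path change. Please confirm whether the test case needs to be adapted to match or whether this is a bug.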
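https://gitee.com/anolis/cloud-kernel/pulls/1231 added support for multiple pstore backends and removed the backend entry under /sys/module/pstore/parameters. The selftest cases will be updated to match in a follow-up change.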
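When an interface is removed on purpose, one common way to keep a test from failing is to guard the lookup and report a skip instead. The sketch below illustrates that pattern only; it is not the change that was later merged, and the function name require_param_file is hypothetical. Exit code 4 is the value the kselftest framework conventionally treats as "skipped".

```shell
#!/bin/sh
# Sketch only: skip instead of failing when a sysfs parameter file is absent.
KSFT_SKIP=4   # conventional kselftest "skipped" exit code

# require_param_file is a hypothetical helper.
# $1 (optional) overrides the path so the guard can be tried on any file.
require_param_file() {
    param_file="${1:-/sys/module/pstore/parameters/backend}"
    if [ ! -r "$param_file" ]; then
        echo "SKIP: $param_file is not available on this kernel"
        return "$KSFT_SKIP"
    fi
    echo "OK: $param_file"
    return 0
}
```

A test script could call it as require_param_file || exit $? before reading the parameter. A guard like this also hides an accidental removal, so it only makes sense where the removal is intentional, as stated above.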
On an Anolis 23 x86_64 environment, the community nightly kernel-selftests run shows the same problem for the pstore cases pstore_tests and pstore_post_reboot_tests.
------------------
    [root@qibo-anolis23-nightly-func-x86-1 pstore]# ./pstore_post_reboot_tests
    === Pstore unit tests (pstore_post_reboot_tests) ===
    UUID=b7df0ab0-1f16-47eb-8b8b-84f5ab84beef
    Checking pstore backend is registered ... cat: /sys/module/pstore/parameters/backend: No such file or directory   # <- the problem
    FAIL
    backend=
    cmdline=BOOT_IMAGE=(hd0,msdos1)/boot/vmlinuz-5.10.134-1.git.2ed1510fd4be.an23.x86_64 root=UUID=ece72b7f-465b-433d-8b3b-e5fa53a04642 ro rhgb cryptomgr.notests rcupdate.rcu_cpu_stall_timeout=300 quiet biosdevname=0 net.ifnames=0 console=tty0 console=ttyS0,115200n8 noibrs nvme_core.io_timeout=4294967295 nvme_core.admin_timeout=4294967295 cgroup.memory=nokmem crashkernel=0M-2G:0M,2G-8G:192M,8G-:256M

    [root@qibo-anolis23-nightly-func-x86-1 pstore]# ./pstore_tests
    === Pstore unit tests (pstore_tests) ===
    UUID=43b3d4e9-cd80-4fec-9de5-cca79f6a0000
    Checking pstore backend is registered ... cat: /sys/module/pstore/parameters/backend: No such file or directory   # <- the problem
    FAIL
    backend=
    cmdline=BOOT_IMAGE=(hd0,msdos1)/boot/vmlinuz-5.10.134-1.git.2ed1510fd4be.an23.x86_64 root=UUID=ece72b7f-465b-433d-8b3b-e5fa53a04642 ro rhgb cryptomgr.notests rcupdate.rcu_cpu_stall_timeout=300 quiet biosdevname=0 net.ifnames=0 console=tty0 console=ttyS0,115200n8 noibrs nvme_core.io_timeout=4294967295 nvme_core.admin_timeout=4294967295 cgroup.memory=nokmem crashkernel=0M-2G:0M,2G-8G:192M,8G-:256M

    [root@qibo-anolis23-nightly-func-x86-1 pstore]# uname -r
    5.10.134-1.git.2ed1510fd4be.an23.x86_64
    [root@qibo-anolis23-nightly-func-x86-1 pstore]# cat /etc/anolis-release
    Anolis OS release 23
https://gitee.com/anolis/cloud-kernel/pulls/1388 has been merged.
After the developers' fix, the latest nightly run shows this case as skipped, which matches its earlier behavior. The problem is resolved and the bug is closed.
    # uname -r
    5.10.134-334.git.23b31a882ca6.an8.x86_64
    # ./pstore_post_reboot_tests
    === Pstore unit tests (pstore_post_reboot_tests) ===
    UUID=e6fa6409-d7a5-47a6-8cef-d0cb7a182af5
    cmdline=BOOT_IMAGE=(hd0,msdos1)/boot/vmlinuz-5.10.134-334.git.23b31a882ca6.an8.x86_64 root=UUID=d9790f8b-1457-4091-889a-109a4f446404 ro cryptomgr.notests rcupdate.rcu_cpu_stall_timeout=300 vring_force_dma_api rhgb quiet biosdevname=0 net.ifnames=0 console=tty0 console=ttyS0,115200n8 noibrs nvme_core.io_timeout=4294967295 nvme_core.admin_timeout=4294967295 cgroup.memory=nokmem crashkernel=0M-2G:0M,2G-8G:192M,8G-:256M
    pstore_crash_test has not been executed yet. we skip further tests.
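For triaging nightly logs across kernels, the two outcomes seen in this report can be told apart by the text the test prints. The following is a rough sketch under that assumption: the helper name classify_pstore_log is invented here, and it matches only the strings that appear in the logs above.

```shell
#!/bin/sh
# Sketch: classify a pstore selftest log read from stdin.
# classify_pstore_log is a hypothetical helper; the patterns are taken
# from the log excerpts in this report and nothing else.
classify_pstore_log() {
    log=$(cat)
    case "$log" in
        *"parameters/backend: No such file or directory"*)
            echo "fail: backend parameter file missing" ;;
        *FAIL*)
            echo "fail" ;;
        *"we skip further tests"*)
            echo "skip" ;;
        *)
            echo "unknown" ;;
    esac
}
```

Typical use would be ./pstore_post_reboot_tests 2>&1 | classify_pstore_log, so that the cat error message written to stderr is included in what gets classified.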