Bug 4257 - [Anck 5.10 nightly/ANCK-5.10-14-rc1][Anolis8][x86_64] pstore_post_reboot_tests case fails under kernel-selftest
Summary: [Anck 5.10 nightly/ANCK-5.10-14-rc1][Anolis8][x86_64] kernel-selftest: pstore_...
Status: CLOSED FIXED
Alias: None
Product: Antest
Classification: Infrastructures
Component: Test cases
Version: unspecified
Hardware: x86_64 Linux
Importance: P3-Medium S3-normal
Target Milestone: ---
Assignee: xiangzao
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-02-28 16:08 UTC by shanxifanshi
Modified: 2023-07-25 15:29 UTC
CC List: 9 users

See Also:


Description shanxifanshi alibaba_cloud_group 2023-02-28 16:08:53 UTC
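[Defect description]:
The pstore_post_reboot_tests case under kernel-selftest fails.

Reproduction environment:
ANCK 5.10, x86 physical machine

Reproduction rate:
Always reproducible

Kernel information: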
# uname -r
5.10.134-320.git.04d8c84896c6.an8.x86_64

OS information:
# cat /etc/os-release
NAME="Anolis OS"
VERSION="8.8"
ID="anolis"
ID_LIKE="rhel fedora centos"
VERSION_ID="8.8"
PLATFORM_ID="platform:an8"
PRETTY_NAME="Anolis OS 8.8"
ANSI_COLOR="0;31"
HOME_URL="https://openanolis.cn/"

CPU information:
# lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              4
On-line CPU(s) list: 0-3
Thread(s) per core:  2
Core(s) per socket:  2
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
BIOS Vendor ID:      Alibaba Cloud
CPU family:          6
Model:               106
Model name:          Intel(R) Xeon(R) Platinum 8369B CPU @ 2.70GHz
BIOS Model name:     pc-i440fx-2.1
Stepping:            6
CPU MHz:             2699.998
BogoMIPS:            5399.99
Hypervisor vendor:   KVM
Virtualization type: full
L1d cache:           48K
L1i cache:           32K
L2 cache:            1280K
L3 cache:            49152K
NUMA node0 CPU(s):   0-3
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid tsc_known_freq pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single ibrs_enhanced fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves wbnoinvd arat avx512vbmi pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid fsrm arch_capabilities

# free -h
              total        used        free      shared  buff/cache   available
Mem:           15Gi       226Mi        13Gi       1.0Mi       1.9Gi        14Gi
Swap:            0B          0B          0B

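[Reproduction steps]:
Download the kernel source package that matches the running kernel.
rpm -ivh xxx.src.rpm   # installs under /root by default
yum-builddep -y rpmbuild/SPECS/kernel.spec   # installs the build dependencies automatically; requires yum-utils
rpmbuild -bp ./rpmbuild/SPECS/kernel.spec   # applies the patches, unpacks the tarball, and creates the BUILD directory
cd rpmbuild/BUILD/kernel-xxx/linux-xxx/

The test can now be built:
cd tools/testing/selftests/pstore
make

Run the test case:
./pstore_post_reboot_tests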

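[Expected result]:
The case passes.

[Actual result]:
The case fails.

[Root cause analysis]:
1. Up to February 23, this case was always skipped.
2. Since the evening of February 24, this case has always failed.
3. Investigation shows that the failure is reported because the path /sys/module/pstore/parameters/backend is missing. This path existed until the evening of February 24.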
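To tell a missing path apart from a path that exists but has no backend registered, a small check along the following lines may help. This is only a sketch: the function name pstore_backend_status and its return codes are hypothetical and are not part of the kernel selftests, and treating "(null)" as "no backend registered" is an assumption about how the parameter is printed.

```shell
#!/bin/sh
# Hypothetical diagnostic helper, not taken from the kernel selftests.
# Prints one of: missing | unregistered | registered:<name>
pstore_backend_status() {
    # $1: path to inspect; defaults to the sysfs path named in this report
    param="${1:-/sys/module/pstore/parameters/backend}"

    # The situation reported here: the path does not exist at all
    if [ ! -e "$param" ]; then
        echo "missing"
        return 2
    fi

    backend=$(cat "$param" 2>/dev/null)

    # Assumption: an empty value or "(null)" means no backend is registered
    if [ -z "$backend" ] || [ "$backend" = "(null)" ]; then
        echo "unregistered"
        return 1
    fi

    echo "registered:$backend"
    return 0
}
```

Calling pstore_backend_status with no argument inspects the real sysfs path. On the kernels discussed in this report it would be expected to print "missing".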
Comment 1 shanxifanshi alibaba_cloud_group 2023-02-28 16:13:03 UTC
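Recent kernel changes include pstore-related commits, so the failure may be caused by a kernel code change. Could the developers please confirm whether this is expected behavior? https://gitee.com/anolis/cloud-kernel/pulls/1231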
Comment 2 yunhe123 alibaba_cloud_group 2023-02-28 17:10:31 UTC
[Anolis8][Anck 5.10 aarch64][internal nightly] The pstore_post_reboot_tests case has the same problem, and the pstore_tests case also fails for the same reason:

Failure log for pstore_post_reboot_tests:
# ./pstore_post_reboot_tests
=== Pstore unit tests (pstore_post_reboot_tests) ===
UUID=d248ad17-54cb-450f-ae8e-772bc553a318
Checking pstore backend is registered ... cat: /sys/module/pstore/parameters/backend: No such file or directory
FAIL
        backend=
        cmdline=BOOT_IMAGE=(hd1,gpt2)/boot/vmlinuz-5.10.134-568.git.eb96941d6.an8.aarch64 root=UUID=7418b0c7-2c7e-4f3b-a364-88735caf892a ro console=tty0 console=ttyS0,115200 rd.driver.pre=ahci cgroup.memory=nokmem crashkernel=0M-2G:0M,2G-8G:192M,8G-:256M iommu.passthrough=1 iommu.strict=0


Failure log for pstore_tests:
# ./pstore_tests
=== Pstore unit tests (pstore_tests) ===
UUID=79bc7ac1-887b-4fc2-bb85-84f1101d01c1
Checking pstore backend is registered ... cat: /sys/module/pstore/parameters/backend: No such file or directory
FAIL
        backend=
        cmdline=BOOT_IMAGE=(hd1,gpt2)/boot/vmlinuz-5.10.134-568.git.eb96941d6.an8.aarch64 root=UUID=7418b0c7-2c7e-4f3b-a364-88735caf892a ro console=tty0 console=ttyS0,115200 rd.driver.pre=ahci cgroup.memory=nokmem crashkernel=0M-2G:0M,2G-8G:192M,8G-:256M iommu.passthrough=1 iommu.strict=0
Comment 3 shanxifanshi alibaba_cloud_group 2023-03-02 10:39:53 UTC
This problem also exists on the an8 5.10.134-14_rc1 kernel.

# uname -r
5.10.134-14_rc1.an8.x86_64

Test log:
# selftests: pstore: pstore_post_reboot_tests
# === Pstore unit tests (pstore_post_reboot_tests) ===
# UUID=6702a5a8-7055-49cd-adb6-f849398c8ccf
# Checking pstore backend is registered ... cat: /sys/module/pstore/parameters/backend: No such file or directory
# FAIL
# 	backend=
# 	cmdline=BOOT_IMAGE=(hd0,msdos1)/boot/vmlinuz-5.10.134-14_rc1.an8.x86_64 root=UUID=118262e8-70d9-4b7e-9ae0-fdd5c12355d2 ro cryptomgr.notests rcupdate.rcu_cpu_stall_timeout=300 vring_force_dma_api rhgb quiet biosdevname=0 net.ifnames=0 console=tty0 console=ttyS0,115200n8 noibrs nvme_core.io_timeout=4294967295 nvme_core.admin_timeout=4294967295 cgroup.memory=nokmem crashkernel=0M-2G:0M,2G-8G:192M,8G-:256M
not ok 2 selftests: pstore: pstore_post_reboot_tests # exit=1

# ll /sys/module/pstore/parameters/backend
ls: cannot access '/sys/module/pstore/parameters/backend': No such file or directory
Comment 4 yunmeng365524 2023-03-04 21:31:09 UTC
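pstore has changed, and this looks like the result of a path change. Please confirm whether the case needs to be adapted accordingly or whether this is a bug.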
Comment 5 xiangzao alibaba_cloud_group 2023-03-06 10:00:02 UTC
https://gitee.com/anolis/cloud-kernel/pulls/1231
This pull request added support for multiple pstore backends and removed the backend path. The selftest case will be updated to match.
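For illustration only, one way a test could treat the missing path as a skip rather than a failure is sketched below. This is not the change that was made to the selftest. The function name backend_check_rc is hypothetical, and the only convention relied on is that kselftest treats exit code 4 as "skipped".

```shell
#!/bin/sh
# Hypothetical sketch, not the merged selftest change.
KSFT_SKIP=4   # exit code that the kselftest framework reports as "skipped"

# Returns 0 if a backend name can be read, KSFT_SKIP if the parameter
# path does not exist on this kernel, and 1 for any other problem.
backend_check_rc() {
    # $1: path to inspect; defaults to the sysfs path named in this report
    param="${1:-/sys/module/pstore/parameters/backend}"

    # Path removed (for example by a kernel change): nothing to verify here
    [ -e "$param" ] || return "$KSFT_SKIP"

    backend=$(cat "$param" 2>/dev/null) || return 1
    [ -n "$backend" ] || return 1
    return 0
}
```

A test script could call backend_check_rc and exit with its return value, so that kernels without the path show up as skipped instead of failed.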
Comment 6 anolislw alibaba_cloud_group 2023-03-06 13:07:26 UTC
In an Anolis 23 x86_64 environment, the community nightly kernel-selftest run shows the same problem for the pstore cases pstore_tests and pstore_post_reboot_tests.
------------------
[root@qibo-anolis23-nightly-func-x86-1 pstore]# ./pstore_post_reboot_tests
=== Pstore unit tests (pstore_post_reboot_tests) ===
UUID=b7df0ab0-1f16-47eb-8b8b-84f5ab84beef
Checking pstore backend is registered ... cat: /sys/module/pstore/parameters/backend: No such file or directory   # the problem
FAIL
        backend=
        cmdline=BOOT_IMAGE=(hd0,msdos1)/boot/vmlinuz-5.10.134-1.git.2ed1510fd4be.an23.x86_64 root=UUID=ece72b7f-465b-433d-8b3b-e5fa53a04642 ro rhgb cryptomgr.notests rcupdate.rcu_cpu_stall_timeout=300 quiet biosdevname=0 net.ifnames=0 console=tty0 console=ttyS0,115200n8 noibrs nvme_core.io_timeout=4294967295 nvme_core.admin_timeout=4294967295 cgroup.memory=nokmem crashkernel=0M-2G:0M,2G-8G:192M,8G-:256M
[root@qibo-anolis23-nightly-func-x86-1 pstore]# ./pstore_tests
=== Pstore unit tests (pstore_tests) ===
UUID=43b3d4e9-cd80-4fec-9de5-cca79f6a0000
Checking pstore backend is registered ... cat: /sys/module/pstore/parameters/backend: No such file or directory # the problem
FAIL
        backend=
        cmdline=BOOT_IMAGE=(hd0,msdos1)/boot/vmlinuz-5.10.134-1.git.2ed1510fd4be.an23.x86_64 root=UUID=ece72b7f-465b-433d-8b3b-e5fa53a04642 ro rhgb cryptomgr.notests rcupdate.rcu_cpu_stall_timeout=300 quiet biosdevname=0 net.ifnames=0 console=tty0 console=ttyS0,115200n8 noibrs nvme_core.io_timeout=4294967295 nvme_core.admin_timeout=4294967295 cgroup.memory=nokmem crashkernel=0M-2G:0M,2G-8G:192M,8G-:256M
[root@qibo-anolis23-nightly-func-x86-1 pstore]#
[root@qibo-anolis23-nightly-func-x86-1 pstore]# uname -r
5.10.134-1.git.2ed1510fd4be.an23.x86_64
[root@qibo-anolis23-nightly-func-x86-1 pstore]# cat /etc/anolis-release
Anolis OS release 23
Comment 7 xiangzao alibaba_cloud_group 2023-03-13 08:51:42 UTC
https://gitee.com/anolis/cloud-kernel/pulls/1388
Merged.
Comment 8 shanxifanshi alibaba_cloud_group 2023-03-13 10:35:43 UTC
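After the developer's fix, the latest nightly test results show this case as skipped, consistent with the earlier behavior. The problem is resolved; closing the bug.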

# uname -r
5.10.134-334.git.23b31a882ca6.an8.x86_64

# ./pstore_post_reboot_tests
=== Pstore unit tests (pstore_post_reboot_tests) ===
UUID=e6fa6409-d7a5-47a6-8cef-d0cb7a182af5
        cmdline=BOOT_IMAGE=(hd0,msdos1)/boot/vmlinuz-5.10.134-334.git.23b31a882ca6.an8.x86_64 root=UUID=d9790f8b-1457-4091-889a-109a4f446404 ro cryptomgr.notests rcupdate.rcu_cpu_stall_timeout=300 vring_force_dma_api rhgb quiet biosdevname=0 net.ifnames=0 console=tty0 console=ttyS0,115200n8 noibrs nvme_core.io_timeout=4294967295 nvme_core.admin_timeout=4294967295 cgroup.memory=nokmem crashkernel=0M-2G:0M,2G-8G:192M,8G-:256M
pstore_crash_test has not been executed yet. we skip further tests.