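Bug 3693 - [Anck 5.10 x86_64][community nightly] Record of test cases that fail under the framework but pass when run manually
Summary: [Anck 5.10 x86_64][community nightly] Record of test cases that fail under the framework but pass when run manually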
Status: NEW
Alias: None
Product: Antest
Classification: Infrastructures
Component: Test cases
Version: unspecified
Hardware: x86_64 Linux
Importance: P3-Medium S3-normal
Target Milestone: ---
Assignee: shanxifanshi
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-01-10 18:35 UTC by shanxifanshi
Modified: 2023-02-16 11:28 UTC
CC List: 6 users

See Also:


Description shanxifanshi alibaba_cloud_group 2023-01-10 18:35:49 UTC
[Defect description]:
Some test cases fail when run through the Tone framework but pass when run manually. This bug records them.

Affected test cases:
netfilter.conntrack_vrf.sh
seccomp.seccomp_benchmark
seccomp.seccomp_bpf
net.udpgso_bench.sh
net.reuseport_bpf_numa
net.reuseport_bpf



Reproduction environment:
anck 5.10, x86 physical machine

Reproduction rate:
Always reproducible

Kernel information:
# uname -r
5.10.134-268.git.53f303a6c3fa.an8.x86_64

Operating system information:
# cat /etc/os-release
NAME="Anolis OS"
VERSION="8.6"
ID="anolis"
ID_LIKE="rhel fedora centos"
VERSION_ID="8.6"
PLATFORM_ID="platform:an8"
PRETTY_NAME="Anolis OS 8.6"
ANSI_COLOR="0;31"
HOME_URL="https://openanolis.cn/"

CPU information:
# lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              4
On-line CPU(s) list: 0-3
Thread(s) per core:  2
Core(s) per socket:  2
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
BIOS Vendor ID:      Alibaba Cloud
CPU family:          6
Model:               106
Model name:          Intel(R) Xeon(R) Platinum 8369B CPU @ 2.70GHz
BIOS Model name:     pc-i440fx-2.1
Stepping:            6
CPU MHz:             2699.998
BogoMIPS:            5399.99
Hypervisor vendor:   KVM
Virtualization type: full
L1d cache:           48K
L1i cache:           32K
L2 cache:            1280K
L3 cache:            49152K
NUMA node0 CPU(s):   0-3
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid tsc_known_freq pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single ibrs_enhanced fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves wbnoinvd arat avx512vbmi pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid fsrm arch_capabilitie

Memory information:
# free -h
              total        used        free      shared  buff/cache   available
Mem:           15Gi       355Mi        11Gi       1.0Mi       3.6Gi        14Gi
Swap:            0B          0B          0B

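[Steps to reproduce]:
Download the kernel source package that matches the running kernel
rpm -ivh xxx.src.rpm   # installs under /root by default
yum-builddep -y rpmbuild/SPECS/kernel.spec   # installs the build dependencies automatically; requires yum-utils
rpmbuild -bp ./rpmbuild/SPECS/kernel.spec   # applies the relevant patches, unpacks the tarball, and creates the BUILD directory
cd rpmbuild/BUILD/kernel-xxx/linux-xxx/

The tests can now be built and run. Taking conntrack_vrf.sh as an example:
cd tools/testing/selftests/netfilter/
make

Run the test case
./conntrack_vrf.sh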
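
To compare a manual run with the framework's result, it can help to keep the exit status and the output of the manual run. The following is only a minimal sketch: `classify_rc` and `run_and_record` are hypothetical helper names, the log path is arbitrary, and the only convention relied on is that kselftest scripts exit with 0 on pass and 4 on skip.

```shell
#!/bin/bash
# Hypothetical helpers, not part of Tone or kselftest.

# Map an exit status to a label (kselftest convention: 0 = pass, 4 = skip).
classify_rc() {
    case "$1" in
        0) echo pass ;;
        4) echo skip ;;
        *) echo fail ;;
    esac
}

# Run a test script from its own directory, save its output to a log file,
# and print a one-line verdict.
run_and_record() {
    local test_path="$1"
    local log_file="$2"
    local rc=0
    ( cd "$(dirname "$test_path")" && "./$(basename "$test_path")" ) \
        >"$log_file" 2>&1 || rc=$?
    echo "$(basename "$test_path"): $(classify_rc "$rc") (rc=$rc)"
}

# Example, from the root of the prepared kernel source tree:
#   run_and_record tools/testing/selftests/netfilter/conntrack_vrf.sh /tmp/conntrack_vrf.manual.log
```

The saved log can then be compared with the log that the framework produced for the same case.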

[Expected result]:
The test case passes

[Actual result]:
The test case fails when run through the Tone framework
Comment 1 shanxifanshi alibaba_cloud_group 2023-01-10 18:40:46 UTC
These test cases are also affected:
net.xfrm_policy.sh
net.run_afpackettests
Comment 2 shanxifanshi alibaba_cloud_group 2023-01-11 16:54:27 UTC
The xfstests generic/648 test case on the xfs file system has the same problem
Comment 3 shanxifanshi alibaba_cloud_group 2023-01-11 16:55:05 UTC
(In reply to shanxifanshi from comment #2)
> The xfstests generic/648 test case on the xfs file system has the same problem

The generic/313 test case is affected as well
Comment 4 shanxifanshi alibaba_cloud_group 2023-01-12 09:37:50 UTC
The read_all_proc test case under ltp also fails intermittently because of timeouts, while it always passes when run on its own. Will keep observing.
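
One possible way to observe an intermittent timeout like this is to repeat the case under a fixed time limit and count the outcomes. This is a sketch under assumptions: `repeat_with_timeout` is a hypothetical helper, the time limit and run count are arbitrary, and the actual command that launches read_all_proc has to be supplied by the caller. It relies on coreutils `timeout`, which exits with status 124 when the limit is reached.

```shell
#!/bin/bash
# Hypothetical helper: run a command several times under a time limit and
# count passes, failures, and timeouts.
repeat_with_timeout() {
    local limit="$1"
    local runs="$2"
    shift 2
    local pass=0 fail=0 timed_out=0 i=0 rc
    while [ "$i" -lt "$runs" ]; do
        rc=0
        timeout "$limit" "$@" >/dev/null 2>&1 || rc=$?
        case "$rc" in
            0)   pass=$((pass + 1)) ;;
            124) timed_out=$((timed_out + 1)) ;;   # limit reached
            *)   fail=$((fail + 1)) ;;
        esac
        i=$((i + 1))
    done
    echo "pass=$pass fail=$fail timeout=$timed_out"
}

# Example (the command after the run count is a placeholder):
#   repeat_with_timeout 300 20 <command that runs read_all_proc>
```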
Comment 5 yunhe123 alibaba_cloud_group 2023-02-16 11:14:08 UTC
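[anck 5.10-aarch64][nightly] The following test cases fail when run through the tone framework but pass when run manually after the environment is rebooted. Recording them here for now; this will be updated once the root cause is located: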
kernel-selftests:
bpf.test_tunnel.sh
bpf.test_lwt_ip_encap.sh
bpf.test_xdping.sh
netfilter.nft_nat.sh

alitests:
ali_cpuacct_07
Comment 6 shanxifanshi alibaba_cloud_group 2023-02-16 11:28:29 UTC
(In reply to yunhe123 from comment #5)
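> [anck 5.10-aarch64][nightly] The following test cases fail when run through the tone framework but pass when run manually after the environment is rebooted. Recording them here for now; this will be updated once the root cause is located: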
> kernel-selftests:
> bpf.test_tunnel.sh
> bpf.test_lwt_ip_encap.sh
> bpf.test_xdping.sh
> netfilter.nft_nat.sh
> 
> alitests:
> ali_cpuacct_07

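--- Is this a community issue? If it is not, I would personally suggest not using this bug to track test cases that pass when run manually. Otherwise, when these cases genuinely fail during release testing, the problem is easy to miss. In addition, for cases that pass manually, the cause should be investigated as far as possible. If the reason they fail under the framework can be found, for example another test case leaving the environment uncleaned, we can try to improve it.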
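
As a starting point for checking whether an earlier case leaves the environment uncleaned, the system state can be captured before and after a case and compared. This is only a sketch: `snapshot_env` and `report_leftovers` are hypothetical helpers, and the choice of state to capture (network namespaces, network interfaces, loaded modules) is an assumption based on the net, netfilter, and bpf cases listed in this bug.

```shell
#!/bin/bash
# Hypothetical helpers for spotting state that a test case leaves behind.

# Write a snapshot of network namespaces, interfaces, and loaded modules.
# Commands that are unavailable simply contribute no lines.
snapshot_env() {
    {
        ip netns list 2>/dev/null | sed 's/^/netns: /' || true
        ip -o link show 2>/dev/null | awk -F': ' '{print "link: " $2}' || true
        lsmod 2>/dev/null | awk 'NR > 1 {print "module: " $1}' || true
    } >"$1"
}

# Print the lines that appear in the "after" snapshot but not in "before".
report_leftovers() {
    grep -Fxv -f "$1" "$2" || true
}

# Example:
#   snapshot_env /tmp/before.txt
#   ./conntrack_vrf.sh
#   snapshot_env /tmp/after.txt
#   report_leftovers /tmp/before.txt /tmp/after.txt
```

Any line printed by `report_leftovers` is something that exists after the case but did not exist before it, which makes it a candidate for the kind of leftover state described above.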