Bug 6182 - [ANCK-5.10-16][x86_64][nightly] D2015_native_vsyscall test case fails, error: TFAIL: gettimeofday test failed in 5u container, 3 sub processes failed
Summary: [ANCK-5.10-16][x86_64][nightly] D2015_native_vsyscall test case fails, error: TFAIL: gettimeo...
Status: RESOLVED WORKSFORME
Alias: None
Product: Antest
Classification: Infrastructures
Component: Test cases
Version: unspecified
Hardware: x86_64 Linux
Importance: P3-Medium S3-normal
Target Milestone: ---
Assignee: yunmeng365524
QA Contact: shuming
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-08-17 17:22 UTC by shanxifanshi
Modified: 2023-10-18 14:53 UTC
Description shanxifanshi alibaba_cloud_group 2023-08-17 17:22:42 UTC
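[Defect description]:
The D2015_native_vsyscall test case fails with "TFAIL: gettimeofday test failed in 5u container" and "3 sub processes failed". So far the failure has been observed on the F51 machine model; the same test previously passed on the N49 machine model.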


Test log:
<<<test_start>>>
tag=D2015_native_vsyscall stime=1692068822
cmdline="D2015_native_vsyscall.sh"
contacts=""
analysis=exit
<<<test_output>>>
D2015_native_vsyscall 1 TINFO: Install or update docker ...
D2015_native_vsyscall 1 TINFO: docker container a749b13a1c4e2544ddcf4323055d13f1483051f9fbde5c2fbbc886ce7e6b87ec
D2015_native_vsyscall 1 TINFO: run gettimeofday test in container
Compile do_gettimeofday.c with -static
Start 96 sub processes to run do_gettimeofday
3 sub processes failed
D2015_native_vsyscall 1 TFAIL: gettimeofday test failed in 5u container
D2015_native_vsyscall 2 TINFO: Stop and remove container a749b13a1c4e2544ddcf4323055d13f1483051f9fbde5c2fbbc886ce7e6b87ec
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
time="2023-08-15T11:08:32+08:00" level=warning msg="StopSignal  failed to stop container sweet_galois in 10 seconds, resorting to SIGKILL"
a749b13a1c4e2544ddcf4323055d13f1483051f9fbde5c2fbbc886ce7e6b87ec
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
a749b13a1c4e2544ddcf4323055d13f1483051f9fbde5c2fbbc886ce7e6b87ec

Summary:
passed   0
failed   1
skipped  0
warnings 0
<<<execution_status>>>
initiation_status="ok"
duration=90 termination_type=exited termination_id=1 corefile=no
cutime=3265 cstime=219
<<<test_end>>>
Reproduction environment:
ANCK 5.10, x86 physical machine, F51 machine model

Reproduction rate:
Always reproducible

Kernel information:
# uname -r
5.10.134-77.git.a73b26af8.an8.x86_64


Operating system information:
# cat /etc/os-release
NAME="Anolis OS"
VERSION="8.8"
ID="anolis"
ID_LIKE="rhel fedora centos"
VERSION_ID="8.8"
PLATFORM_ID="platform:an8"
PRETTY_NAME="Anolis OS 8.8"
ANSI_COLOR="0;31"
HOME_URL="https://openanolis.cn/"

CPU information:
# lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              96
On-line CPU(s) list: 0-95
Thread(s) per core:  2
Core(s) per socket:  24
Socket(s):           2
NUMA node(s):        1
Vendor ID:           GenuineIntel
BIOS Vendor ID:      Intel(R) Corporation
CPU family:          6
Model:               85
Model name:          Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz
BIOS Model name:     Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz
Stepping:            4
CPU MHz:             999.644
CPU max MHz:         3100.0000
CPU min MHz:         1000.0000
BogoMIPS:            5000.00
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            1024K
L3 cache:            33792K
NUMA node0 CPU(s):   0-95
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single pti intel_ppin ssbd mba ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts pku ospke md_clear flush_l1d arch_capabilities

Memory information:
# free -h
              total        used        free      shared  buff/cache   available
Mem:          503Gi       3.1Gi       499Gi       2.0Mi       639Mi       497Gi
Swap:            0B          0B          0B

Kernel parameters:
# cat /proc/cmdline
BOOT_IMAGE=(hd0,msdos1)/boot/vmlinuz-5.10.134-77.git.a73b26af8.an8.x86_64 root=UUID=a5b37697-06a2-41c2-9345-b1aced9b8a70 ro cryptomgr.notests rcupdate.rcu_cpu_stall_timeout=300 vring_force_dma_api rhgb quiet biosdevname=0 net.ifnames=0 console=tty0 console=ttyS0,115200n8 noibrs nvme_core.io_timeout=4294967295 nvme_core.admin_timeout=4294967295 cgroup.memory=nokmem crashkernel=0M-2G:0M,2G-8G:192M,8G-:256M

[Steps to reproduce]:
git clone git@gitlab-sp.alibaba-inc.com:alikernel/ltp.git
cd ltp
export BUILD_ALITESTS_ONLY=yes
export CFLAGS="${CFLAGS} -fcommon"  # required when building with gcc 10
make autotools
./configure 
make && make install
cd /opt/ltp
./runltp -f alitests -s D2015_native_vsyscall

[Expected result]:
The test case passes.

[Actual result]:
The test case fails.

[Test analysis]:
Brief manual reproduction steps:
yum install -y podman-docker
docker pull reg.docker.alibaba-inc.com/ali/os:5u7

Share the local D2015_native_vsyscall directory with the container's /test directory:
container_id=`docker run -d --net host --volume alitests/testcases/data/D2015_native_vsyscall:/test  reg.docker.alibaba-inc.com/ali/os:5u7`

Run docker_5u_gettimeofday.sh inside the container to perform the test:
docker exec $container_id /test/docker_5u_gettimeofday.sh

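The test case first starts a docker container, then starts 96 do_gettimeofday processes inside the container (the count depends on the number of CPUs on the machine), and then uses wait $pid to check whether each process exits on its own. All processes are expected to exit on their own, but in practice 3 processes are always left over.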
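
The spawn-and-wait pattern described above can be sketched roughly as follows. This is not the actual docker_5u_gettimeofday.sh, whose contents are not shown in this report. The function name run_workers is hypothetical, and treating a non-zero status from wait as a failed sub process is an assumption about how the real script counts failures.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the real docker_5u_gettimeofday.sh.
# run_workers N CMD [ARGS...]
#   Starts N background copies of CMD, waits on each PID, and prints
#   how many of them exited with a non-zero status.
#   The body runs in a subshell so its variables stay local.
run_workers() (
    count=$1
    shift
    pids=""
    failed=0
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@" &
        pids="$pids $!"
        i=$((i + 1))
    done
    for pid in $pids; do
        # Assumption: a non-zero wait status counts as one failed sub process.
        wait "$pid" || failed=$((failed + 1))
    done
    echo "$failed sub processes failed"
    [ "$failed" -eq 0 ]
)

# Example (commented out): one worker per online CPU, using the
# statically linked binary named in the test log.
# run_workers "$(getconf _NPROCESSORS_ONLN)" ./do_gettimeofday
```

Note that wait blocks until the given process exits, so a worker that never exits would stall a loop like this unless the real script also applies a timeout; the report does not say which approach it uses.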
Comment 1 shanxifanshi alibaba_cloud_group 2023-10-16 13:51:31 UTC
Verified with the an8 016 kernel on the F51 machine model; the problem still exists.

<<<test_start>>>
tag=D2015_native_vsyscall stime=1697435355
cmdline="D2015_native_vsyscall.sh"
contacts=""
analysis=exit
<<<test_output>>>
incrementing stop
D2015_native_vsyscall 1 TINFO: Install or update docker ...
Last metadata expiration check: 3:39:00 ago on Mon 16 Oct 2023 10:10:17 AM CST.
Package podman-docker-3:4.4.1-16.0.1.module+an8.8.0+11116+677fb6f4.noarch is already installed.
Dependencies resolved.
Nothing to do.
Complete!
D2015_native_vsyscall 1 TINFO: docker pull reg.docker.alibaba-inc.com/ali/os:5u7
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
Trying to pull reg.docker.alibaba-inc.com/ali/os:5u7...
Getting image source signatures
Copying blob 02ab59f4136f skipped: already exists
Copying blob 602e11d93b34 skipped: already exists
Copying blob 89ec547e2c8b skipped: already exists
Copying blob ff00c56a073b skipped: already exists
Copying blob 51cc571d8ceb skipped: already exists
Copying blob 8c458e452612 skipped: already exists
Copying blob d49175812ba8 skipped: already exists
Copying blob 846545e81a36 skipped: already exists
Copying blob bb8d77da8060 skipped: already exists
Copying blob 5fdfed67c76f skipped: already exists
Copying blob 835d45138c30 skipped: already exists
Copying blob e074daa2d968 skipped: already exists
Copying blob 639347476b33 skipped: already exists
Copying blob 4c3a66800d96 skipped: already exists
Copying blob a81d4905970e skipped: already exists
Copying blob 5f70bf18a086 skipped: already exists
Copying blob 909a6a5ae131 skipped: already exists
Copying blob caded7b9af53 skipped: already exists
Writing manifest to image destination
Storing signatures
51ae17e7a69b082139d2d81743e8ef47e1abdea11fcfcd43a8618fff36d6add2
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
D2015_native_vsyscall 1 TINFO: docker container 035d77feb496f22f02c3d1ae984b1e8b076391826eecaa2cd871ecf7018915d2
D2015_native_vsyscall 1 TINFO: run gettimeofday test in container
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
which: no gcc in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin)
Loaded plugins: branch, fastestmirror, security
Determining fastest mirrors
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package gcc.x86_64 0:4.1.2-51.2.alios5 set to be updated
--> Processing Dependency: cpp = 4.1.2-51.2.alios5 for package: gcc
--> Processing Dependency: glibc-devel >= 2.2.90-12 for package: gcc
--> Running transaction check
---> Package cpp.x86_64 0:4.1.2-51.2.alios5 set to be updated
---> Package glibc-devel.x86_64 0:2.5-81.2 set to be updated
--> Processing Dependency: glibc-headers = 2.5-81.2 for package: glibc-devel
--> Processing Dependency: glibc-headers for package: glibc-devel
--> Running transaction check
---> Package glibc-headers.x86_64 0:2.5-81.2 set to be updated
--> Processing Dependency: kernel-headers >= 2.2.1 for package: glibc-headers
--> Processing Dependency: kernel-headers for package: glibc-headers
--> Running transaction check
---> Package kernel-headers.x86_64 0:2.6.18-274.el5 set to be updated
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package          Arch     Version               Repository                Size
================================================================================
Installing:
 gcc              x86_64   4.1.2-51.2.alios5     alios.5.base.x86_64      5.3 M
Installing for dependencies:
 cpp              x86_64   4.1.2-51.2.alios5     alios.5.base.x86_64      2.9 M
 glibc-devel      x86_64   2.5-81.2              alios.5.base.x86_64      2.4 M
 glibc-headers    x86_64   2.5-81.2              alios.5.base.x86_64      596 k
 kernel-headers   x86_64   2.6.18-274.el5        redhat.5u7.base.x86_64   1.3 M

Transaction Summary
================================================================================
Install       5 Package(s)
Upgrade       0 Package(s)

Total download size: 12 M
Downloading Packages:
--------------------------------------------------------------------------------
Total                                            33 MB/s |  12 MB     00:00
Running rpm_check_debug
Running Transaction Test
Finished Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing     : cpp                                                      1/5
  Installing     : kernel-headers                                           2/5
  Installing     : glibc-headers                                            3/5
  Installing     : glibc-devel                                              4/5
  Installing     : gcc                                                      5/5

Installed:
  gcc.x86_64 0:4.1.2-51.2.alios5

Dependency Installed:
  cpp.x86_64 0:4.1.2-51.2.alios5      glibc-devel.x86_64 0:2.5-81.2
  glibc-headers.x86_64 0:2.5-81.2     kernel-headers.x86_64 0:2.6.18-274.el5

Complete!
Compile do_gettimeofday.c with -static
Start 96 sub processes to run do_gettimeofday
3 sub processes failed
D2015_native_vsyscall 1 TFAIL: gettimeofday test failed in 5u container
D2015_native_vsyscall 2 TINFO: Stop and remove container 035d77feb496f22f02c3d1ae984b1e8b076391826eecaa2cd871ecf7018915d2
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
WARN[0010] StopSignal  failed to stop container hungry_pasteur in 10 seconds, resorting to SIGKILL
035d77feb496f22f02c3d1ae984b1e8b076391826eecaa2cd871ecf7018915d2
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
035d77feb496f22f02c3d1ae984b1e8b076391826eecaa2cd871ecf7018915d2

Summary:
passed   0
failed   1
skipped  0
warnings 0
<<<execution_status>>>
initiation_status="ok"
duration=56 termination_type=exited termination_id=1 corefile=no
cutime=372 cstime=49
<<<test_end>>>

# uname -r
5.10.134-16_rc1.an8.x86_64
Comment 2 yunmeng365524 2023-10-18 14:53:45 UTC
The 016 version is being tracked internally.