[Defect description]:
When the LTP test case controllers/cpuacct_1_1 runs, the cpuacct.usage value of the subgroups it creates does not match expectations. Test log:

<<<test_start>>>
tag=cpuacct_1_1 stime=1741060128
cmdline="cpuacct.sh 1 1"
contacts=""
analysis=exit
<<<test_output>>>
cpuacct 1 TINFO: Running: cpuacct.sh 1 1
cpuacct 1 TINFO: timeout per run is 0h 5m 0s
tst_pid.c:84: TINFO: Cannot read session user limits from '/sys/fs/cgroup/user.slice/user-1377975.slice/pids.max'
tst_pid.c:94: TINFO: Found limit of processes 1648358 (from /sys/fs/cgroup/pids/user.slice/user-1377975.slice/pids.max)
cpuacct 1 TINFO: task limit fulfilled (approximate need 1, limit 1646945)
cpuacct 1 TINFO: cpuacct: /sys/fs/cgroup/cpuset,cpu,cpuacct
cpuacct 1 TINFO: Creating 1 subgroups each with 1 processes
cpuacct 1 TFAIL: cpuacct.usage is not equal to 0 for 1 subgroups
cpuacct 1 TPASS: cpuacct.usage equal to subgroup*/cpuacct.usage
cpuacct 2 TINFO: removing created directories
Summary:
passed   1
failed   1
broken   0
skipped  0
warnings 0
<<<execution_status>>>
initiation_status="ok"
duration=0 termination_type=exited termination_id=1 corefile=no cutime=4 cstime=3
<<<test_end>>>

<<<test_start>>>
tag=cpuacct_1_10 stime=1741060128
cmdline="cpuacct.sh 1 10"
contacts=""
analysis=exit
<<<test_output>>>
cpuacct 1 TINFO: Running: cpuacct.sh 1 10
cpuacct 1 TINFO: timeout per run is 0h 5m 0s
tst_pid.c:84: TINFO: Cannot read session user limits from '/sys/fs/cgroup/user.slice/user-1377975.slice/pids.max'
tst_pid.c:94: TINFO: Found limit of processes 1648358 (from /sys/fs/cgroup/pids/user.slice/user-1377975.slice/pids.max)
cpuacct 1 TINFO: task limit fulfilled (approximate need 10, limit 1646945)
cpuacct 1 TINFO: cpuacct: /sys/fs/cgroup/cpuset,cpu,cpuacct
cpuacct 1 TINFO: Creating 1 subgroups each with 10 processes
cpuacct 1 TFAIL: cpuacct.usage is not equal to 0 for 1 subgroups
cpuacct 1 TPASS: cpuacct.usage equal to subgroup*/cpuacct.usage
cpuacct 2 TINFO: removing created directories
Summary:
passed   1
failed   1
broken   0
skipped  0
warnings 0
<<<execution_status>>>
initiation_status="ok"
duration=1 termination_type=exited termination_id=1 corefile=no cutime=20 cstime=4
<<<test_end>>>

<<<test_start>>>
tag=cpuacct_1_100 stime=1741060129
cmdline="cpuacct.sh 1 100"
contacts=""
analysis=exit
<<<test_output>>>
incrementing stop
cpuacct 1 TINFO: Running: cpuacct.sh 1 100
cpuacct 1 TINFO: timeout per run is 0h 5m 0s
tst_pid.c:84: TINFO: Cannot read session user limits from '/sys/fs/cgroup/user.slice/user-1377975.slice/pids.max'
tst_pid.c:94: TINFO: Found limit of processes 1648358 (from /sys/fs/cgroup/pids/user.slice/user-1377975.slice/pids.max)
cpuacct 1 TINFO: task limit fulfilled (approximate need 100, limit 1646945)
cpuacct 1 TINFO: cpuacct: /sys/fs/cgroup/cpuset,cpu,cpuacct
cpuacct 1 TINFO: Creating 1 subgroups each with 100 processes
cpuacct 1 TFAIL: cpuacct.usage is not equal to 0 for 1 subgroups
cpuacct 1 TPASS: cpuacct.usage equal to subgroup*/cpuacct.usage
cpuacct 2 TINFO: removing created directories
Summary:
passed   1
failed   1
broken   0
skipped  0
warnings 0
<<<execution_status>>>
initiation_status="ok"
duration=0 termination_type=exited termination_id=1 corefile=no cutime=158 cstime=13
<<<test_end>>>

[Reproduction probability]: Always reproducible

[Reproduction environment]:
Yitian 710 machine, 100.82.243.208

#uname -r
6.6.71-3_rc1.al8.aarch64

#cat /etc/os-release
NAME="Alibaba Cloud Linux"
VERSION="3 (Soaring Falcon)"
ID="alinux"
ID_LIKE="rhel fedora centos anolis"
VERSION_ID="3"
UPDATE_ID="10"
PLATFORM_ID="platform:al8"
PRETTY_NAME="Alibaba Cloud Linux 3 (Soaring Falcon)"
ANSI_COLOR="0;31"
HOME_URL="https://www.aliyun.com/"

#lscpu
Architecture:           aarch64
Byte Order:             Little Endian
CPU(s):                 124
On-line CPU(s) list:    0-123
Thread(s) per core:     1
Core(s) per socket:     124
Socket(s):              1
NUMA node(s):           2
Vendor ID:              ARM
BIOS Vendor ID:         T-HEAD
Model:                  0
Model name:             Neoverse-N2
BIOS Model name:        Yitian710-124
Stepping:               r0p0
CPU MHz:                2750.002
BogoMIPS:               100.00
Hypervisor vendor:      Alibaba
Virtualization type:    full
L1d cache:              64K
L1i cache:              64K
L2 cache:               1024K
L3 cache:               65536K
NUMA node0 CPU(s):      0-61
NUMA node1 CPU(s):      62-123
Flags:                  fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh

#free -h
              total        used        free      shared  buff/cache   available
Mem:          251Gi       5.7Gi       243Gi       9.0Mi       4.0Gi       245Gi
Swap:         2.0Gi       116Mi       1.9Gi

#cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt2)/boot/vmlinuz-6.6.71-3_rc1.al8.aarch64 root=UUID=5d4c9cac-5324-464b-8971-09deff261ae7 ro biosdevname=0 rd.driver.pre=ahci iommu.passthrough=1 iommu.strict=0 nospectre_bhb ssbd=force-off systemd.unified_cgroup_hierarchy=0 cgroup.memory=nokmem console=ttyS0,115200 fsck.repair=yes crashkernel=0M-2G:0M,2G-256G:256M,256G-1024G:320M,1024G-:384M

#rpm -qa | grep kernel | grep 6.6.71-3_rc1.al8
kernel-devel-6.6.71-3_rc1.al8.aarch64
kernel-headers-6.6.71-3_rc1.al8.aarch64
kernel-debuginfo-6.6.71-3_rc1.al8.aarch64
kernel-6.6.71-3_rc1.al8.aarch64
kernel-debuginfo-common-aarch64-6.6.71-3_rc1.al8.aarch64

#mount |grep cgroup
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/cpuset,cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset,cpu,cpuacct)
cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
cgroup2 on /tmp/ltp-uAjZp8fYTs/cgroup_unified type cgroup2 (rw,relatime,memory_recursiveprot)

[Reproduction steps]:
# Download and build the test suite
git clone http://gitlab.alibaba-inc.com/alikernel/ltp.git -b Ali6000
cd ltp
make autotools
./configure
make
make install

# Run the test case
cd /opt/ltp
./runltp -f controllers -s cpuacct_1_1

Test cases showing the same problem:
cpuacct_1_10
cpuacct_10_10
cpuacct_1_100
cpuacct_100_1
cpuacct_100_100

[Expected result]:
The test case PASSes.

[Actual result]:
The test case FAILs.

[Analysis]:
The code involved is in testcases/bin/cpuacct.sh:

do_test()
{
    tst_res TINFO "Creating $max subgroups each with $nbprocess processes"

    # create and attach process to subgroups
    for i in `seq 1 $max`; do
        for j in `seq 1 $nbprocess`; do
            cpuacct_task $testpath/subgroup_$i/tasks &
            echo $! >> task_pids
        done
    done

    for pid in $(cat task_pids); do wait $pid; done
    rm -f task_pids

    acc=0
    fails=0
    for i in `seq 1 $max`; do
        tmp=`cat $testpath/subgroup_$i/cpuacct.usage`
        if [ "$tmp" -eq "0" ]; then
            fails=$((fails + 1))
        fi
        acc=$((acc + tmp))
    done

    ## check that cpuacct.usage != 0 for every subgroup
    if [ "$fails" -gt "0" ]; then
        tst_res TFAIL "cpuacct.usage is not equal to 0 for $fails subgroups"
    else
        tst_res TPASS "cpuacct.usage is not equal to 0 for every subgroup"
    fi

    ## check that ltp_subgroup/cpuacct.usage == sum ltp_subgroup/subgroup*/cpuacct.usage
    ref=`cat $testpath/cpuacct.usage`
    if [ "$ref" -ne "$acc" ]; then
        tst_res TFAIL "cpuacct.usage $ref not equal to subgroup*/cpuacct.usage $acc"
    else
        tst_res TPASS "cpuacct.usage equal to subgroup*/cpuacct.usage"
    fi
}
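In this environment the forked cpuacct_task processes are apparently never attached to the subgroups, so every subgroup's cpuacct.usage stays 0 and the first check fails. This can be confirmed by hand; the snippet below is only an illustrative sketch (the group name ltp_demo is arbitrary, and the path assumes the co-mounted cpuset,cpu,cpuacct v1 hierarchy shown in the mount output above):

cd /sys/fs/cgroup/cpuset,cpu,cpuacct
mkdir ltp_demo
cat ltp_demo/cpuset.cpus     # empty: a new v1 cpuset child inherits nothing by default
cat ltp_demo/cpuset.mems     # empty as well
echo $$ > ltp_demo/tasks     # rejected with "No space left on device" (ENOSPC)
cat ltp_demo/cpuacct.usage   # stays 0, matching the TFAIL in the log
rmdir ltp_demo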
This is a test case problem: on a machine using cgroup v1 where cpuset is co-mounted with cpu,cpuacct, the test must set cpuset.cpus and cpuset.mems on each cgroup it creates; otherwise tasks cannot be attached to the cgroup and the test cannot run as intended.
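A minimal sketch of such a fix, placed before the attach loop in the do_test() excerpt above, might look like the following. It assumes the subgroup directories already exist at that point; cgroup_root is a name introduced here (not a variable the script defines) and is derived from $testpath with dirname, and the guard keeps the whole block a no-op on hierarchies where cpuset is not co-mounted. This is an illustration of the missing step, not a final patch:

# Seed cpuset.cpus/cpuset.mems so that "echo $pid > .../tasks" is not
# rejected with ENOSPC on a co-mounted cpuset,cpu,cpuacct hierarchy.
cgroup_root=$(dirname "$testpath")   # hierarchy mount point, e.g. /sys/fs/cgroup/cpuset,cpu,cpuacct

if [ -f "$testpath/cpuset.cpus" ]; then
    # the test group itself was freshly created, so initialize it first
    cat "$cgroup_root/cpuset.cpus" > "$testpath/cpuset.cpus"
    cat "$cgroup_root/cpuset.mems" > "$testpath/cpuset.mems"

    for i in `seq 1 $max`; do
        cat "$testpath/cpuset.cpus" > "$testpath/subgroup_$i/cpuset.cpus"
        cat "$testpath/cpuset.mems" > "$testpath/subgroup_$i/cpuset.mems"
    done
fi

Enabling cgroup.clone_children on the parent group before creating the subgroups would be an alternative way to have the cpuset settings inherited automatically.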