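Bug 4629 - [Internal nightly][Alibaba Cloud Linux 3][x86_64 & aarch64] alitests case memcg_qos fails: in the memory.min test, child cgroup memory is not distributed in the expected proportion; in the memory.low test, child cgroup memory allocation does not match expectations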
Status: NEW
Alias: None
Product: Antest
Classification: Infrastructures
Component: Test cases
Version: unspecified
Hardware: All Linux
Importance: P3-Medium S3-normal
Target Milestone: ---
Assignee: Jacob
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-03-28 18:29 UTC by wangpingping
Modified: 2023-03-29 09:51 UTC (History)

See Also:


Description wangpingping alibaba_cloud_group 2023-03-28 18:29:47 UTC
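[Defect description]:
The alitests case memcg_qos fails: in the memory.min test, child cgroup memory is not distributed in the expected proportion; in the memory.low test, child cgroup memory allocation does not match expectations.

Previously the case failed because the Alinux2 kernel had ported some cgroup v2 features to v1, while Alinux3 had not yet ported this part:
alinux: mm,memcg: export memory.{min,low} to cgroup v1

The interfaces now exist, but the values do not match expectations.

The failure log is as follows: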
# ./memcg_qos
tst_test.c:1066: INFO: Timeout per run is 1h 00m 00s
TEST PASS: test_memcg_current
child 0~3 min: 29M 21M 0M 0M
TEST FAIL: test_memcg_min
child 0~3 low stage0: 50M 50M 0M 50M
TEST FAIL: test_memcg_low
TEST PASS: test_memcg_high
TEST PASS: test_memcg_max
memcg_qos.c:1038: FAIL: memcg QoS


Summary:
passed   0
failed   1
skipped  0
warnings 0


[Reproduction rate]:
Always reproducible

[Reproduction environment]:
Environment: offline VM
Kernel:
# uname -r
5.10.134-876.git.3c4c5575a42e.al8.aarch64
# cat /etc/os-release
NAME="Alibaba Cloud Linux"
VERSION="3 (Soaring Falcon)"
ID="alinux"
ID_LIKE="rhel fedora centos anolis"
VERSION_ID="3"
PLATFORM_ID="platform:al8"
PRETTY_NAME="Alibaba Cloud Linux 3 (Soaring Falcon)"
ANSI_COLOR="0;31"
HOME_URL="https://www.aliyun.com/"

CPU information:
# lscpu
Architecture:        aarch64
Byte Order:          Little Endian
CPU(s):              4
On-line CPU(s) list: 0-3
Thread(s) per core:  1
Core(s) per cluster: 4
Socket(s):           4
Cluster(s):          1
NUMA node(s):        1
Vendor ID:           HiSilicon
BIOS Vendor ID:      Alibaba Cloud
Model:               0
Model name:          Kunpeng-920
BIOS Model name:     virt-rhel7.6.0
Stepping:            0x1
BogoMIPS:            200.00
NUMA node0 CPU(s):   0-3
Flags:               fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm

Memory information:
# free -h
              total        used        free      shared  buff/cache   available
Mem:           15Gi       601Mi        12Gi        32Mi       1.8Gi        14Gi
Swap:            0B          0B          0B

[Steps to reproduce]:
# Build the test case
git clone http://gitlab.alibaba-inc.com/alikernel/ltp
export BUILD_ALITESTS_ONLY=yes
cd ltp
export CFLAGS="-fcommon"
make autotools
./configure
make
make install

# Run the test case
/opt/ltp/testcases/bin/memcg_qos

[Expected result]:
The test case passes.

[Actual result]:
The test case fails.


[Problem analysis]:
The code at the failing checks is as follows:

test_memcg_min
Line 648:
    648         if (!values_close(c[1], MB(17), 10))
    649                 goto cleanup;


test_memcg_low
    802         if (!values_close(c[1], MB(25), 10))
    803                 goto cleanup;
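
As a rough illustration of why both checks fail, the sketch below assumes that values_close() and MB() follow the definitions used in the upstream kernel cgroup selftests; the alitests copy is not shown in this report and may differ. close_mb() is a hypothetical wrapper added only for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed to match the test's MB() macro: mebibytes to bytes. */
#define MB(x) ((long)(x) << 20)

/*
 * Assumed definition, modeled on the helper of the same name in the
 * upstream kernel cgroup selftests: a and b count as "close" when they
 * differ by at most err percent of their sum.
 */
static int values_close(long a, long b, int err)
{
	return labs(a - b) <= (a + b) / 100 * err;
}

/* Hypothetical wrapper taking megabyte inputs; returns 1 when close. */
int close_mb(long actual_mb, long expected_mb, int err_pct)
{
	assert(actual_mb >= 0 && expected_mb >= 0 && err_pct >= 0);
	return values_close(MB(actual_mb), MB(expected_mb), err_pct);
}
```

Under these assumptions, close_mb(21, 17, 10) and close_mb(50, 25, 10) both return 0, which matches the two failures, while close_mb(29, 33, 10) returns 1, which is consistent with c[0] passing in test_memcg_min.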


The test logic and results are as follows:
test_memcg_min
Directory  Setting            |  Expectation
A          memory.min = 50M   |
A/B        memory.min = 50M   |  memory.usage_in_bytes ~= 50M (pass)
A/B/C      memory.min = 75M   |  memory.usage_in_bytes ~= 33M (pass)
A/B/D      memory.min = 25M   |  memory.usage_in_bytes ~= 17M (fail), actual value 21M
A/B/E      memory.min = 500M  |  memory.usage_in_bytes ~= 0 (pass)
A/B/F      memory.min = 0     |

A/G                           |  malloc 148M
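
The expected values of 33M and 17M are consistent with splitting the 50M protection of A/B among its children in proportion to each child's protected usage, min(usage, memory.min). The sketch below illustrates that arithmetic only; it is not the kernel's code. split_protection() is a hypothetical helper, and the per-child usage of 50M, 50M, 0M, 50M before pressure is an assumption (borrowed from the "low stage0" log line), since this report does not state it for the min test.

```c
#include <assert.h>

/* Smaller of two values. */
static long min_long(long a, long b)
{
	return a < b ? a : b;
}

/*
 * Hypothetical helper: split parent_prot among n children in proportion
 * to each child's protected usage, min(usage[i], prot[i]). All values
 * are in MB; results are rounded to the nearest MB and stored in out[].
 */
void split_protection(long parent_prot, const long *usage, const long *prot,
		      long *out, int n)
{
	long total = 0;
	int i;

	assert(parent_prot >= 0 && n >= 0);

	for (i = 0; i < n; i++)
		total += min_long(usage[i], prot[i]);

	for (i = 0; i < n; i++) {
		long share = min_long(usage[i], prot[i]);

		if (total <= parent_prot)	/* no overcommit: keep full share */
			out[i] = share;
		else				/* overcommit: scale proportionally */
			out[i] = (share * parent_prot + total / 2) / total;
	}
}
```

With parent_prot = 50, usage = {50, 50, 0, 50} and prot = {75, 25, 500, 0} for A/B/C through A/B/F, this yields 33, 17, 0 and 0, the expectations listed in the table. The measured 29M and 21M suggest the running kernel distributes protection somewhat differently from this simple model.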



test_memcg_low
Directory  Setting            |  Expectation 1                                          |  Expectation 2
A          memory.low = 50M   |                                                         |
A/B        memory.low = 50M   |  memory.usage_in_bytes ~= 50M (pass)                    |  memory.usage_in_bytes ~= 50M (pass)
A/B/C      memory.low = 75M   |  memory.usage_in_bytes ~= 50M (pass)                    |  memory.usage_in_bytes ~= 33M (pass)
A/B/D      memory.low = 25M   |  memory.usage_in_bytes ~= 25M (fail), actual value 50M  |  memory.usage_in_bytes ~= 17M (pass)
A/B/E      memory.low = 500M  |  memory.usage_in_bytes ~= 0 (pass)
A/B/F      memory.low = 0     |

A/G                           |  malloc 100M                                            |  malloc 148M


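The results above show that in test_memcg_min the actual value is close to the expected value, so the test can pass if the error tolerance is adjusted; in test_memcg_low, however, the actual value clearly does not match the expectation.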
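
To put a number on "adjusting the error tolerance", the sketch below searches for the smallest integer percentage that would satisfy the closeness check. It assumes the check has the form labs(a - b) <= (a + b) / 100 * err, as in the upstream kernel cgroup selftests; min_tolerance_pct() is a hypothetical helper, and the inputs are the MB-rounded values from the log, so the real byte counts may shift the result slightly.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed to match the test's MB() macro: mebibytes to bytes. */
#define MB(x) ((long)(x) << 20)

/*
 * Hypothetical helper: smallest integer percentage err (0..100) for which
 * labs(a - b) <= (a + b) / 100 * err holds, or -1 if none does.
 */
int min_tolerance_pct(long a, long b)
{
	int err;

	assert(a >= 0 && b >= 0);

	for (err = 0; err <= 100; err++)
		if (labs(a - b) <= (a + b) / 100 * err)
			return err;
	return -1;
}
```

Under these assumptions, min_tolerance_pct(MB(21), MB(17)) returns 11, so raising the tolerance from 10 to 11 would be enough for test_memcg_min, whereas min_tolerance_pct(MB(50), MB(25)) returns 34, which is too large to treat as measurement noise in test_memcg_low.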
Comment 1 wangpingping alibaba_cloud_group 2023-03-29 09:51:17 UTC
The same issue is tracked on the internal defect platform, ali5000, with status "later".