Bug 2194 - [Anck 5.10 aarch64][nightly] 6 coldpgs-related test cases under alitests report "Can't get enough cold page cache"
Summary: [Anck 5.10 aarch64][nightly] 6 coldpgs-related test cases under alitests: Can't get enough cold page...
Status: NEW
Alias: None
Product: Anolis OS 8
Classification: Anolis OS
Component: Others
Version: 8.2
Hardware: All Linux
Importance: P3-Medium S3-normal
Target Milestone: ---
Assignee: shuming
QA Contact: shuming
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-09-16 11:18 UTC by yunhe123
Modified: 2023-07-05 16:02 UTC
CC List: 7 users

See Also:


Attachments

Description yunhe123 alibaba_cloud_group 2022-09-16 11:18:17 UTC
[Defect description]:
On Anck 5.10 aarch64, 6 coldpgs-related test cases under alitests report "Can't get enough cold page cache".
Failed test cases:
coldpgs_global_reclaim01
coldpgs_global_reclaim02
coldpgs_global_reclaim03
coldpgs_memcg_reclaim01
coldpgs_memcg_reclaim02
coldpgs_memcg_reclaim03

Failure logs:
<<<test_start>>>
tag=coldpgs_global_reclaim01 stime=1663266281
cmdline="coldpgs_global_reclaim01.sh"
contacts=""
analysis=exit
<<<test_output>>>
coldpgs_global_reclaim01 1 TINFO: Create test file: test.data 128M
128+0 records in
128+0 records out
134217728 bytes (134 MB, 128 MiB) copied, 0.149666 s, 897 MB/s
coldpgs_global_reclaim01 1 TINFO: Test with memory cgroup: /sys/fs/cgroup/memory/test1855999
262144+0 records in
262144+0 records out
134217728 bytes (134 MB, 128 MiB) copied, 0.721505 s, 186 MB/s
coldpgs_global_reclaim01 1 TFAIL: Can't get enough cold page cache

Summary:
passed   0
failed   1
skipped  0
warnings 0
<<<execution_status>>>
initiation_status="ok"
duration=303 termination_type=exited termination_id=1 corefile=no
cutime=73 cstime=170
<<<test_end>>>

<<<test_start>>>
tag=coldpgs_global_reclaim02 stime=1663266584
cmdline="coldpgs_global_reclaim02.sh"
contacts=""
analysis=exit
<<<test_output>>>
coldpgs_global_reclaim02 1 TINFO: Set scan_period_in_seconds = 15
coldpgs_global_reclaim02 1 TINFO: Set scan_target = page_cache
coldpgs_global_reclaim02 1 TINFO: Create 3 test files (each file 128M)
coldpgs_global_reclaim02 1 TINFO: Create 3 testing cgroups
coldpgs_global_reclaim02 1 TINFO: Load page caches
load_pagecaches -c /sys/fs/cgroup/memory/test856730/cgtest1/tasks /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_global_reclaim02.3kmay6whYL/testfile1
load_pagecaches -c /sys/fs/cgroup/memory/test856730/cgtest2/tasks /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_global_reclaim02.3kmay6whYL/testfile2
load_pagecaches -c /sys/fs/cgroup/memory/test856730/cgtest3/tasks /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_global_reclaim02.3kmay6whYL/testfile3
coldpgs_global_reclaim02 1 TINFO: wait some time to get 402653184 bytes memory idle
Read through the file /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_global_reclaim02.3kmay6whYL/testfile1 (size: 134217728): done
Read through the file /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_global_reclaim02.3kmay6whYL/testfile2 (size: 134217728): done
Read through the file /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_global_reclaim02.3kmay6whYL/testfile3 (size: 134217728): done
coldpgs_global_reclaim02 1 TINFO: Set /sys/kernel/mm/coldpgs/threshold 2
coldpgs_global_reclaim02 1 TINFO: Expected memory dropped: 393216
coldpgs_global_reclaim02 1 TINFO: Actual memory dropped: 312124
coldpgs_global_reclaim02 1 TFAIL: Failed to drop cold page caches
coldpgs_global_reclaim02 1 TINFO: dropped pagecage in memory.coldpgs.stats: 312124
coldpgs_global_reclaim02 1 TINFO: dropped pagecage in system view: -311984
coldpgs_global_reclaim02 1 TINFO: Stop background worker tasks

Summary:
passed   0
failed   1
skipped  0
warnings 0
<<<execution_status>>>
initiation_status="ok"
duration=1022 termination_type=exited termination_id=1 corefile=no
cutime=656 cstime=684
<<<test_end>>>


<<<test_start>>>
tag=coldpgs_global_reclaim03 stime=1663267606
cmdline="coldpgs_global_reclaim03.sh"
contacts=""
analysis=exit
<<<test_output>>>
coldpgs_global_reclaim03 1 TINFO: Set scan_period_in_seconds = 15
coldpgs_global_reclaim03 1 TINFO: Set scan_target = page_cache
coldpgs_global_reclaim03 1 TINFO: Create 3 test files (each file 128M)
coldpgs_global_reclaim03 1 TINFO: Create 3 testing cgroups
coldpgs_global_reclaim03 1 TINFO: Load page caches
load_pagecaches -c /sys/fs/cgroup/memory/test3232622/cgtest1/tasks -w -l /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_global_reclaim03.ChvndCUnis/testfile1
load_pagecaches -c /sys/fs/cgroup/memory/test3232622/cgtest2/tasks -w -l /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_global_reclaim03.ChvndCUnis/testfile2
load_pagecaches -c /sys/fs/cgroup/memory/test3232622/cgtest3/tasks -w -l /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_global_reclaim03.ChvndCUnis/testfile3
coldpgs_global_reclaim03 1 TINFO: wait some time to get 402653184 bytes memory idle
coldpgs_global_reclaim03 1 TFAIL: failed to get 402653184 bytes memory idle
coldpgs_global_reclaim03 1 TINFO: Set /sys/kernel/mm/coldpgs/threshold 2
coldpgs_global_reclaim03 1 TINFO: Expected memory dropped: 0
coldpgs_global_reclaim03 1 TINFO: Actual memory dropped: 0
coldpgs_global_reclaim03 1 TPASS: With reclaim_locked_mem=0, 0 KB locked memory is reclaimed as expected
coldpgs_global_reclaim03 1 TINFO: Stop background worker tasks
Read through the file /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_global_reclaim03.ChvndCUnis/testfile1 (size: 134217728): done
Read through the file /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_global_reclaim03.ChvndCUnis/testfile2 (size: 134217728): done
Read through the file /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_global_reclaim03.ChvndCUnis/testfile3 (size: 134217728): done
coldpgs_global_reclaim03 2 TINFO: Set scan_period_in_seconds = 15
coldpgs_global_reclaim03 2 TINFO: Set scan_target = page_cache
coldpgs_global_reclaim03 2 TINFO: Create 3 test files (each file 128M)
coldpgs_global_reclaim03 2 TINFO: Create 3 testing cgroups
coldpgs_global_reclaim03 2 TINFO: Load page caches
load_pagecaches -c /sys/fs/cgroup/memory/test3232622/cgtest1/tasks -w -l /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_global_reclaim03.ChvndCUnis/testfile1
load_pagecaches -c /sys/fs/cgroup/memory/test3232622/cgtest2/tasks -w -l /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_global_reclaim03.ChvndCUnis/testfile2
load_pagecaches -c /sys/fs/cgroup/memory/test3232622/cgtest3/tasks -w -l /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_global_reclaim03.ChvndCUnis/testfile3
coldpgs_global_reclaim03 2 TINFO: wait some time to get 402653184 bytes memory idle
coldpgs_global_reclaim03 2 TFAIL: failed to get 402653184 bytes memory idle
coldpgs_global_reclaim03 2 TINFO: Set /sys/kernel/mm/coldpgs/threshold 2
coldpgs_global_reclaim03 2 TINFO: Expected memory dropped: 393216
coldpgs_global_reclaim03 2 TINFO: Actual memory dropped: 115984
coldpgs_global_reclaim03 2 TFAIL: With reclaim_locked_mem=1,  115984 KB locked memory is reclaimed, but expect 393216
coldpgs_global_reclaim03 2 TINFO: dropped pagecage in system view: -115784 KB
coldpgs_global_reclaim03 2 TINFO: Stop background worker tasks
Read through the file /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_global_reclaim03.ChvndCUnis/testfile1 (size: 134217728): done
Read through the file /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_global_reclaim03.ChvndCUnis/testfile2 (size: 134217728): done
Read through the file /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_global_reclaim03.ChvndCUnis/testfile3 (size: 134217728): done

Summary:
passed   1
failed   3
skipped  0
warnings 0
<<<execution_status>>>
initiation_status="ok"
duration=2070 termination_type=exited termination_id=1 corefile=no
cutime=1396 cstime=1479
<<<test_end>>>


<<<test_start>>>
tag=coldpgs_memcg_reclaim01 stime=1663269676
cmdline=" coldpgs_memcg_reclaim01.sh"
contacts=""
analysis=exit
<<<test_output>>>
coldpgs_memcg_reclaim01 1 TINFO: Create test file: test.data 128M
128+0 records in
128+0 records out
134217728 bytes (134 MB, 128 MiB) copied, 0.149454 s, 898 MB/s
coldpgs_memcg_reclaim01 1 TINFO: Test with memory cgroup: /sys/fs/cgroup/memory/test4078749
262144+0 records in
262144+0 records out
134217728 bytes (134 MB, 128 MiB) copied, 0.716944 s, 187 MB/s
coldpgs_memcg_reclaim01 1 TFAIL: Can't get enough cold page cache

Summary:
passed   0
failed   1
skipped  0
warnings 0
<<<execution_status>>>
initiation_status="ok"
duration=303 termination_type=exited termination_id=1 corefile=no
cutime=75 cstime=166
<<<test_end>>>
<<<test_start>>>
tag=coldpgs_memcg_reclaim02 stime=1663269979
cmdline=" coldpgs_memcg_reclaim02.sh"
contacts=""
analysis=exit
<<<test_output>>>
coldpgs_memcg_reclaim02 1 TINFO: Set scan_period_in_seconds = 15
coldpgs_memcg_reclaim02 1 TINFO: Set scan_target = page_cache
coldpgs_memcg_reclaim02 1 TINFO: Create 3 test files (each file 128M)
coldpgs_memcg_reclaim02 1 TINFO: Create 3 testing cgroups
coldpgs_memcg_reclaim02 1 TINFO: Load page caches
load_pagecaches -c /sys/fs/cgroup/memory/test3077685/cgtest1/tasks /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_memcg_reclaim02.vXKbpN9qIy/testfile1
load_pagecaches -c /sys/fs/cgroup/memory/test3077685/cgtest2/tasks /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_memcg_reclaim02.vXKbpN9qIy/testfile2
load_pagecaches -c /sys/fs/cgroup/memory/test3077685/cgtest3/tasks /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_memcg_reclaim02.vXKbpN9qIy/testfile3
coldpgs_memcg_reclaim02 1 TINFO: wait some time to get 402653184 bytes memory idle
Read through the file /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_memcg_reclaim02.vXKbpN9qIy/testfile1 (size: 134217728): done
Read through the file /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_memcg_reclaim02.vXKbpN9qIy/testfile2 (size: 134217728): done
Read through the file /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_memcg_reclaim02.vXKbpN9qIy/testfile3 (size: 134217728): done
coldpgs_memcg_reclaim02 1 TINFO: Set memory.coldpgs.threshold = 2
coldpgs_memcg_reclaim02 1 TINFO: Set memory.coldpgs.size = 134217728
coldpgs_memcg_reclaim02 1 TINFO: Expected memory dropped: 393216
coldpgs_memcg_reclaim02 1 TINFO: Actual memory dropped: 298212
coldpgs_memcg_reclaim02 1 TFAIL: Failed to drop cold page caches
coldpgs_memcg_reclaim02 1 TINFO: dropped pagecage in system view: -297368
coldpgs_memcg_reclaim02 1 TINFO: Stop background worker tasks

Summary:
passed   0
failed   1
skipped  0
warnings 0
<<<execution_status>>>
initiation_status="ok"
duration=1016 termination_type=exited termination_id=1 corefile=no
cutime=629 cstime=698
<<<test_end>>>


<<<test_start>>>
tag=coldpgs_memcg_reclaim03 stime=1663270995
cmdline=" coldpgs_memcg_reclaim03.sh"
contacts=""
analysis=exit
<<<test_output>>>
coldpgs_memcg_reclaim03 1 TINFO: Set scan_period_in_seconds = 15
coldpgs_memcg_reclaim03 1 TINFO: Set scan_target = page_cache
coldpgs_memcg_reclaim03 1 TINFO: Create 3 test files (each file 128M)
coldpgs_memcg_reclaim03 1 TINFO: Create 3 testing cgroups
coldpgs_memcg_reclaim03 1 TINFO: Load page caches
load_pagecaches -c /sys/fs/cgroup/memory/test1202199/cgtest1/tasks -w -l /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_memcg_reclaim03.TcSZf1S376/testfile1
load_pagecaches -c /sys/fs/cgroup/memory/test1202199/cgtest2/tasks -w -l /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_memcg_reclaim03.TcSZf1S376/testfile2
load_pagecaches -c /sys/fs/cgroup/memory/test1202199/cgtest3/tasks -w -l /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_memcg_reclaim03.TcSZf1S376/testfile3
coldpgs_memcg_reclaim03 1 TINFO: wait some time to get 402653184 bytes memory idle
coldpgs_memcg_reclaim03 1 TFAIL: failed to get 402653184 bytes memory idle
coldpgs_memcg_reclaim03 1 TINFO: Set memory.coldpgs.threshold = 2
coldpgs_memcg_reclaim03 1 TINFO: Set memory.coldpgs.size = 134217728
coldpgs_memcg_reclaim03 1 TINFO: Expected memory dropped: 0
coldpgs_memcg_reclaim03 1 TINFO: Actual memory dropped: 0
coldpgs_memcg_reclaim03 1 TPASS: With reclaim_locked_mem=0, 0 KB locked memory is reclaimed as expected
coldpgs_memcg_reclaim03 1 TINFO: Stop background worker tasks
Read through the file /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_memcg_reclaim03.TcSZf1S376/testfile1 (size: 134217728): done
Read through the file /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_memcg_reclaim03.TcSZf1S376/testfile2 (size: 134217728): done
Read through the file /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_memcg_reclaim03.TcSZf1S376/testfile3 (size: 134217728): done
coldpgs_memcg_reclaim03 2 TINFO: Set scan_period_in_seconds = 15
coldpgs_memcg_reclaim03 2 TINFO: Set scan_target = page_cache
coldpgs_memcg_reclaim03 2 TINFO: Create 3 test files (each file 128M)
coldpgs_memcg_reclaim03 2 TINFO: Create 3 testing cgroups
coldpgs_memcg_reclaim03 2 TINFO: Load page caches
load_pagecaches -c /sys/fs/cgroup/memory/test1202199/cgtest1/tasks -w -l /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_memcg_reclaim03.TcSZf1S376/testfile1
load_pagecaches -c /sys/fs/cgroup/memory/test1202199/cgtest2/tasks -w -l /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_memcg_reclaim03.TcSZf1S376/testfile2
load_pagecaches -c /sys/fs/cgroup/memory/test1202199/cgtest3/tasks -w -l /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_memcg_reclaim03.TcSZf1S376/testfile3
coldpgs_memcg_reclaim03 2 TINFO: wait some time to get 402653184 bytes memory idle
coldpgs_memcg_reclaim03 2 TFAIL: failed to get 402653184 bytes memory idle
coldpgs_memcg_reclaim03 2 TINFO: Set memory.coldpgs.threshold = 2
coldpgs_memcg_reclaim03 2 TINFO: Set memory.coldpgs.size = 134217728
coldpgs_memcg_reclaim03 2 TINFO: Expected memory dropped: 393216
coldpgs_memcg_reclaim03 2 TINFO: Actual memory dropped: 91428
coldpgs_memcg_reclaim03 2 TFAIL: With reclaim_locked_mem=1,  91428 KB locked memory is reclaimed, but expect 393216
coldpgs_memcg_reclaim03 2 TINFO: dropped pagecage in system view: -90720 KB
coldpgs_memcg_reclaim03 2 TINFO: Stop background worker tasks
Read through the file /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_memcg_reclaim03.TcSZf1S376/testfile1 (size: 134217728): done
Read through the file /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_memcg_reclaim03.TcSZf1S376/testfile2 (size: 134217728): done
Read through the file /tmp/ltp-MDxPc8Nnc3/LTP_coldpgs_memcg_reclaim03.TcSZf1S376/testfile3 (size: 134217728): done

Summary:
passed   1
failed   3
skipped  0
warnings 0
<<<execution_status>>>
initiation_status="ok"
duration=2070 termination_type=exited termination_id=1 corefile=no
cutime=1361 cstime=1466
<<<test_end>>>
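
To compare the six results at a glance, the TPASS/TFAIL lines in the logs above can be grouped by test tag. The following is only a minimal sketch: the function name `summarize` and the embedded `SAMPLE` text are illustrative assumptions (the sample reuses lines from the first log block), and the line patterns are inferred from the log format shown in this report rather than from the LTP sources.

```python
import re
from collections import defaultdict

# Patterns inferred from the log excerpts in this report (assumption).
TAG_RE = re.compile(r"^tag=(\S+)")
RESULT_RE = re.compile(r"^(\S+)\s+(\d+)\s+(TFAIL|TPASS):\s*(.*)$")


def summarize(log_text):
    """Return {tag: [(subtest number, result, message), ...]} for TPASS/TFAIL lines."""
    results = defaultdict(list)
    current = None
    for raw in log_text.splitlines():
        line = raw.strip()
        tag_match = TAG_RE.match(line)
        if tag_match:
            # A new test block starts; remember its tag.
            current = tag_match.group(1)
            continue
        if line == "<<<test_end>>>":
            current = None
            continue
        result_match = RESULT_RE.match(line)
        if result_match and current is not None:
            results[current].append(
                (int(result_match.group(2)), result_match.group(3), result_match.group(4))
            )
    return dict(results)


# Sample built from lines of the first log block in this report.
SAMPLE = """\
<<<test_start>>>
tag=coldpgs_global_reclaim01 stime=1663266281
cmdline="coldpgs_global_reclaim01.sh"
<<<test_output>>>
coldpgs_global_reclaim01 1 TINFO: Create test file: test.data 128M
coldpgs_global_reclaim01 1 TFAIL: Can't get enough cold page cache
<<<test_end>>>
"""

for tag, entries in summarize(SAMPLE).items():
    for number, result, message in entries:
        print(tag, number, result, message)
```

Feeding the full alitests.run.log to `summarize` instead of `SAMPLE` should give one entry per failing or passing subtest, which makes it easier to see that the 01 cases fail while waiting for cold page cache and the 02/03 cases fail on the reclaimed amount.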

Log details:
https://sam-autotest.oss-cn-hangzhou-zmf.aliyuncs.com/103536/alitests_16632449828511974437/1/alitests.run.log?OSSAccessKeyId=LTAIV2q6AXNHBUnv&Expires=1663304555&Signature=F3GrOWlkT6iTUoF0/jMykpH1j3A%3D

Kernel information:
uname -r
5.10.134-390.git.340957ba0.an8.aarch64

Release information:
cat /etc/os-release
NAME="Anolis OS"
VERSION="8.2"
ID="anolis"
ID_LIKE="rhel fedora centos"
VERSION_ID="8.2"
PLATFORM_ID="platform:an8"
PRETTY_NAME="Anolis OS 8.2"
ANSI_COLOR="0;31"
HOME_URL="https://openanolis.org/"

Memory information:
free -g
              total        used        free      shared  buff/cache   available
Mem:            753           6         746           0           0         744

CPU information:
lscpu
Architecture:        aarch64
Byte Order:          Little Endian
CPU(s):              96
On-line CPU(s) list: 0-95
Thread(s) per core:  1
Core(s) per socket:  48
Socket(s):           2
NUMA node(s):        1
Vendor ID:           HiSilicon
BIOS Vendor ID:      HiSilicon
Model:               0
Model name:          Kunpeng-920
BIOS Model name:     HUAWEI Kunpeng 920 5250
Stepping:            0x1
CPU max MHz:         2600.0000
CPU min MHz:         200.0000
BogoMIPS:            200.00
L1d cache:           64K
L1i cache:           64K
L2 cache:            512K
L3 cache:            24576K
NUMA node0 CPU(s):   0-95
Flags:               fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm


[Probability of occurrence]: Always reproducible

[Steps to reproduce]:

git clone http://ktester:Test\(12\)@gitlab-sp.alibaba-inc.com/alikernel/ltp.git
cd ltp
make autotools
./configure
make
make install
./runltp -f alitests -s coldpgs_memcg_reclaim01
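
The last step above runs a single case. To reproduce all six failures, the same command can be repeated per case. The sketch below only builds the command lines and does not execute anything. The helper name `runltp_commands` is hypothetical, the case names come from the failure logs in this report, and it assumes the `-f`/`-s` invocation behaves for the other cases as it does for `coldpgs_memcg_reclaim01`.

```python
import shlex

# Case names taken from the failure logs in this report.
CASES = [
    "coldpgs_global_reclaim01",
    "coldpgs_global_reclaim02",
    "coldpgs_global_reclaim03",
    "coldpgs_memcg_reclaim01",
    "coldpgs_memcg_reclaim02",
    "coldpgs_memcg_reclaim03",
]


def runltp_commands(cases, scenario="alitests"):
    """Build one `./runltp -f <scenario> -s <case>` argv list per case (nothing is run)."""
    return [["./runltp", "-f", scenario, "-s", case] for case in cases]


for argv in runltp_commands(CASES):
    # Print a shell-safe command line that can be pasted into a terminal.
    print(shlex.join(argv))
```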

[Expected result]:
The test cases pass.

[Actual result]:
The test cases fail.
Comment 1 yunhe123 alibaba_cloud_group 2023-07-05 16:02:42 UTC
Currently only two test cases still fail: coldpgs_global_reclaim03 and coldpgs_memcg_reclaim03. The logs are the same as above.
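
For the two cases that still fail, the gap between the expected and the reclaimed amount in the logs above can be quantified. This is a rough sketch, not the test's own logic: it assumes the expected value (393216 KB) is simply 3 test files of 134217728 bytes each, which matches the logged figure but is not confirmed from the test scripts. The helper names are illustrative.

```python
KIB = 1024
FILES = 3                    # "Create 3 test files (each file 128M)" in the log
FILE_SIZE_BYTES = 134217728  # size of each test file as logged


def expected_kb(files=FILES, size_bytes=FILE_SIZE_BYTES):
    """Expected reclaim in KB, assuming all three files should be dropped."""
    return files * size_bytes // KIB


def shortfall(expected, actual):
    """Return (missing KB, fraction of the expected amount actually reclaimed)."""
    return expected - actual, actual / expected


# "Actual memory dropped" for subtest 2 (reclaim_locked_mem=1) in the logs.
ACTUAL_KB = {
    "coldpgs_global_reclaim03": 115984,
    "coldpgs_memcg_reclaim03": 91428,
}

for case, actual in ACTUAL_KB.items():
    missing, fraction = shortfall(expected_kb(), actual)
    print(f"{case}: missing {missing} KB, reclaimed {fraction:.1%} of expected")
```

In both cases well under a third of the expected locked page cache is reclaimed, which is consistent with the preceding "failed to get 402653184 bytes memory idle" message: the pages had not all become idle when the reclaim was triggered.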