[Defect description]: The kernel-selftests test cgroup/test_zswap fails.
[Reproducibility]: Always
[Steps to reproduce]
1. Download kernel-6.6.25-2_rc1.an23.src.rpm
2. rpm -i kernel-6.6.25-2_rc1.an23.src.rpm
3. yum-builddep -y /root/rpmbuild/SPECS/kernel.spec
   rpmbuild -bp /root/rpmbuild/SPECS/kernel.spec
   cd /root/rpmbuild/BUILD/kernel-6.6.25-2_rc1.an23/linux-6.6.25-2_rc1.an23.aarch64/tools/testing/selftests/cgroup
4. make; ./test_zswap
[Expected result]: The test case passes.
[Actual result]:
[root@iZbp143ti4ccpaufkzata6Z cgroup]# ./test_zswap
ok 1 # SKIP test_no_kmem_bypass
not ok 2 test_no_invasive_cgroup_shrink
[root@iZbp143ti4ccpaufkzata6Z cgroup]# pwd
/root/rpmbuild/BUILD/kernel-6.6.25-2_rc1.an23/linux-6.6.25-2_rc1.an23.aarch64/tools/testing/selftests/kselftest_install/cgroup
[root@iZbp143ti4ccpaufkzata6Z cgroup]# uname -r
6.6.25-2_rc1.an23.aarch64
[Reproduction environment]:
Environment: cloud ECS instance
[root@iZbp143ti4ccpaufkzata6Z breakpoints]# uname -ra
Linux iZbp143ti4ccpaufkzata6Z 6.6.25-2_rc1.an23.aarch64 #1 SMP PREEMPT_DYNAMIC Thu Apr 11 15:02:38 CST 2024 aarch64 aarch64 aarch64 GNU/Linux
[root@iZbp143ti4ccpaufkzata6Z breakpoints]# cat /etc/os-release
NAME="Anolis OS"
VERSION="23"
ID="anolis"
VERSION_ID="23"
PLATFORM_ID="platform:an23"
PRETTY_NAME="Anolis OS 23"
ANSI_COLOR="0;31"
HOME_URL="https://openanolis.cn/"
BUG_REPORT_URL="https://bugzilla.openanolis.cn/"
[root@iZbp143ti4ccpaufkzata6Z breakpoints]# df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        4.0M     0  4.0M   0% /dev
tmpfs            16G     0   16G   0% /dev/shm
tmpfs           6.1G  804K  6.1G   1% /run
efivarfs        256K   18K  239K   7% /sys/firmware/efi/efivars
/dev/nvme0n1p2   40G   13G   27G  33% /
tmpfs            16G  3.1M   16G   1% /tmp
/dev/nvme0n1p1  500M  6.5M  494M   2% /boot/efi
tmpfs           3.1G  4.0K  3.1G   1% /run/user/0
[root@iZbp143ti4ccpaufkzata6Z breakpoints]# free -g
               total        used        free      shared  buff/cache   available
Mem:              30           0          28           0           1          29
Swap:              0           0           0
[root@iZbp143ti4ccpaufkzata6Z breakpoints]# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt2)/boot/vmlinuz-6.6.25-2_rc1.an23.aarch64 root=UUID=6424d533-3c41-4ad9-89fa-1d3bf8c49fd3 ro rhgb crashkernel=0M-2G:0M,2G-64G:256M,64G-:384M iommu.passthrough=1 iommu.strict=0 cryptomgr.notests cgroup.memory=nokmem rcupdate.rcu_cpu_stall_timeout=300 quiet selinux=1 console=tty0 biosdevname=0 net.ifnames=0 console=ttyAMA0,115200n8 noibrs nvme_core.io_timeout=4294967295 nvme_core.admin_timeout=4294967295
[root@iZbp143ti4ccpaufkzata6Z breakpoints]# lscpu
Architecture:             aarch64
  CPU op-mode(s):         32-bit, 64-bit
  Byte Order:             Little Endian
CPU(s):                   8
  On-line CPU(s) list:    0-7
Vendor ID:                ARM
  BIOS Vendor ID:         Alibaba Cloud
  Model name:             Neoverse-N2
    BIOS Model name:      virt-rhel7.6.0 CPU @ 2.0GHz
    BIOS CPU family:      1
    Model:                0
    Thread(s) per core:   1
    Core(s) per socket:   8
    Socket(s):            1
    Stepping:             r0p0
    BogoMIPS:             100.00
    Flags:                fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm ssbs sb dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh
Caches (sum of all):
  L1d:                    512 KiB (8 instances)
  L1i:                    512 KiB (8 instances)
  L2:                     8 MiB (8 instances)
  L3:                     64 MiB (1 instance)
NUMA:
  NUMA node(s):           1
  NUMA node0 CPU(s):      0-7
Vulnerabilities:
  Gather data sampling:   Not affected
  Itlb multihit:          Not affected
  L1tf:                   Not affected
  Mds:                    Not affected
  Meltdown:               Not affected
  Mmio stale data:        Not affected
  Reg file data sampling: Not affected
  Retbleed:               Not affected
  Spec rstack overflow:   Not affected
  Spec store bypass:      Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:             Mitigation; __user pointer sanitization
  Spectre v2:             Mitigation; CSV2, BHB
  Srbds:                  Not affected
  Tsx async abort:        Not affected
The test case source is as follows:

/*
 * When trying to store a memcg page in zswap, if the memcg hits its memory
 * limit in zswap, writeback should not be triggered.
 *
 * This was fixed with commit 0bdf0efa180a("zswap: do not shrink if cgroup may
 * not zswap"). Needs to be revised when a per memcg writeback mechanism is
 * implemented.
 */
static int test_no_invasive_cgroup_shrink(const char *root)
{
	size_t written_back_before, written_back_after;
	int ret = KSFT_FAIL;
	char *test_group;

	/* Set up */
	test_group = cg_name(root, "no_shrink_test");
	if (!test_group)
		goto out;
	if (cg_create(test_group))
		goto out;
	if (cg_write(test_group, "memory.max", "1M"))
		goto out;
	if (cg_write(test_group, "memory.zswap.max", "10K"))
		goto out;
	if (get_zswap_written_back_pages(&written_back_before))
		goto out;

	/* Allocate 10x memory.max to push memory into zswap */
	if (cg_run(test_group, allocate_bytes, (void *)MB(10)))
		goto out;

	/* Verify that no writeback happened because of the memcg allocation */
	if (get_zswap_written_back_pages(&written_back_after))
		goto out;
	if (written_back_after == written_back_before)
		ret = KSFT_PASS;
out:
	cg_destroy(test_group);
	free(test_group);
	return ret;
}

Tracing the run with strace shows where it fails:

openat(AT_FDCWD, "/sys/fs/cgroup/no_shrink_test/memory.zswap.max", O_WRONLY|O_APPEND) = 3
write(3, "10K", 3) = 3
close(3) = 0
openat(AT_FDCWD, "/sys/kernel/debug/zswap/written_back_pages", O_RDONLY) = -1 ENOENT (No such file or directory)
unlinkat(AT_FDCWD, "/sys/fs/cgroup/no_shrink_test", AT_REMOVEDIR) = 0
write(1, "not ok 2 test_no_invasive_cgroup"..., 40not ok 2 test_no_invasive_cgroup_shrink
) = 40
exit_group(1) = ?
+++ exited with 1 +++

Back in the test source, the failure comes from the get_zswap_written_back_pages() function:

static int get_zswap_written_back_pages(size_t *value)
{
	return read_int("/sys/kernel/debug/zswap/written_back_pages", value);
}

The file /sys/kernel/debug/zswap/written_back_pages is missing from the system, so the written-back page count cannot be read.
Could the developers please confirm whether the absence of this file is expected?
According to the definition and implementation in mm/zswap.c, the file /sys/kernel/debug/zswap/written_back_pages exists once the CONFIG_DEBUG_FS option is enabled.
(In reply to hr567q from comment #2)
> According to the definition and implementation in mm/zswap.c, the file
> /sys/kernel/debug/zswap/written_back_pages exists once the
> CONFIG_DEBUG_FS option is enabled.

According to the configs in the repository https://gitee.com/src-anolis-sig/cloud-kernel.git#origin/an23-6.6, every config has CONFIG_DEBUG_FS enabled.
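The root cause is that zswap was not enabled in the kernel before the test was run. This can be resolved either by adapting the test script to enable zswap beforehand, or by enabling zswap by default in the kernel config.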
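The conditions discussed in this thread (zswap enabled, the debugfs counter present, some swap configured) could be checked before the test runs. The following is only a rough sketch: the function name zswap_preflight and the three override variables are hypothetical and exist only so the check can be dry-run against fake files; the default paths are the ones that appear in this report.

```shell
#!/bin/sh
# Hypothetical pre-flight check for test_zswap (not part of kselftest).
# The *_FILE variables are illustrative overrides for dry runs.
ZSWAP_ENABLED_FILE="${ZSWAP_ENABLED_FILE:-/sys/module/zswap/parameters/enabled}"
ZSWAP_WB_FILE="${ZSWAP_WB_FILE:-/sys/kernel/debug/zswap/written_back_pages}"
MEMINFO_FILE="${MEMINFO_FILE:-/proc/meminfo}"

zswap_preflight() {
    rc=0
    # The boolean module parameter reads back as Y when zswap is enabled.
    if [ "$(cat "$ZSWAP_ENABLED_FILE" 2>/dev/null)" != "Y" ]; then
        echo "zswap is not enabled"
        rc=1
    fi
    # This is the file the test failed to open (ENOENT in the strace output).
    if [ ! -r "$ZSWAP_WB_FILE" ]; then
        echo "missing $ZSWAP_WB_FILE"
        rc=1
    fi
    # SwapTotal is reported in kB; 0 means no swap is configured.
    swap_kb=$(awk '/^SwapTotal:/ {print $2}' "$MEMINFO_FILE" 2>/dev/null)
    if [ "${swap_kb:-0}" -eq 0 ]; then
        echo "no swap space configured"
        rc=1
    fi
    return "$rc"
}
```

A wrapper could call `zswap_preflight || exit 4` before `./test_zswap`, where 4 is the kselftest skip code (KSFT_SKIP), so that a misconfigured machine is reported as a skip rather than a failure.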
echo 1 > /sys/module/zswap/parameters/enabled
Running the command above before the test enables zswap at runtime, which is sufficient.
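If the test machine should be left in its original state, the command above can be wrapped so the previous setting is saved and restored. This is a minimal sketch; the helper names zswap_enable and zswap_restore and the ZSWAP_ENABLED_FILE override are assumptions made for illustration, not existing tooling.

```shell
#!/bin/sh
# Sketch: enable zswap for a test run and restore the old setting afterwards.
# Helper names and the override variable are hypothetical.
ZSWAP_ENABLED_FILE="${ZSWAP_ENABLED_FILE:-/sys/module/zswap/parameters/enabled}"

zswap_enable() {
    # Print the previous value (Y or N on a real system) for later restore.
    cat "$ZSWAP_ENABLED_FILE" || return 1
    echo 1 > "$ZSWAP_ENABLED_FILE"
}

zswap_restore() {
    # $1: the value previously printed by zswap_enable.
    echo "$1" > "$ZSWAP_ENABLED_FILE"
}
```

Typical use as root would be `prev=$(zswap_enable) && ./test_zswap; zswap_restore "$prev"`.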
Before running the zswap test, the machine must have at least some swap space configured. In testing, 16M of swap was enough to run the zswap test.
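One way to provide that swap space on a machine that has none is a small swap file. The sketch below assumes a swap file is acceptable (the thread does not say how the 16M was set up); the helper names and any file path passed to them are hypothetical.

```shell
#!/bin/sh
# Sketch: create and activate a small swap file for the zswap test.
# Helper names are hypothetical; the caller chooses the file path.

create_swap_backing_file() {
    # $1: file path, $2: size in MiB (16 was reported as sufficient).
    dd if=/dev/zero of="$1" bs=1048576 count="$2" 2>/dev/null || return 1
    # swapon warns about swap files that are readable by other users.
    chmod 600 "$1"
}

activate_swap_file() {
    # Requires root: write the swap signature, then enable the swap area.
    mkswap "$1" >/dev/null && swapon "$1"
}
```

For example, as root: `create_swap_backing_file /swapfile.test 16 && activate_swap_file /swapfile.test`, followed by `swapoff /swapfile.test` and removing the file once the test has finished.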
The vm.swappiness parameter also needs to be set.
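The thread does not state which vm.swappiness value is required, so the sketch below takes the value as a parameter and leaves the choice to the caller. The helper name set_swappiness and the SWAPPINESS_FILE override are hypothetical; the 0 to 200 range is the one accepted by kernels since 5.8.

```shell
#!/bin/sh
# Sketch: set vm.swappiness and print the previous value for later restore.
# Helper name and override variable are hypothetical.
SWAPPINESS_FILE="${SWAPPINESS_FILE:-/proc/sys/vm/swappiness}"

set_swappiness() {
    # $1: new value, an integer from 0 to 200.
    case "$1" in
        ''|*[!0-9]*)
            echo "swappiness must be a non-negative integer" >&2
            return 1
            ;;
    esac
    if [ "$1" -gt 200 ]; then
        echo "swappiness must be at most 200" >&2
        return 1
    fi
    cat "$SWAPPINESS_FILE" || return 1
    echo "$1" > "$SWAPPINESS_FILE"
}
```

As root, `prev=$(set_swappiness 60)` would apply a value of 60 (chosen here only as an example, being the usual kernel default), and `set_swappiness "$prev" >/dev/null` would put the old value back.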
For the record, the problem also occurs on Anolis 23 with the 6.6.25-2.2_rc1.an23.x86_64 kernel:
================================
[root@5f9Lab15 cgroup]# uname -r
6.6.25-2.2_rc1.an23.x86_64
[root@5f9Lab15 cgroup]# pwd
/root/rpmbuild/BUILD/kernel-6.6.25-2.2_rc1.an23/linux-6.6.25-2.2_rc1.an23.x86_64/tools/testing/selftests/cgroup
[root@5f9Lab15 cgroup]# ./test_zswap
ok 1 # SKIP test_no_kmem_bypass
not ok 2 test_no_invasive_cgroup_shrink
(In reply to anolislw from comment #8)
> For the record, the problem also occurs on Anolis 23 with the
> 6.6.25-2.2_rc1.an23.x86_64 kernel:
> ================================
> [root@5f9Lab15 cgroup]# uname -r
> 6.6.25-2.2_rc1.an23.x86_64
> [root@5f9Lab15 cgroup]# pwd
> /root/rpmbuild/BUILD/kernel-6.6.25-2.2_rc1.an23/linux-6.6.25-2.2_rc1.an23.
> x86_64/tools/testing/selftests/cgroup
> [root@5f9Lab15 cgroup]# ./test_zswap
> ok 1 # SKIP test_no_kmem_bypass
> not ok 2 test_no_invasive_cgroup_shrink

After running echo 1 > /sys/module/zswap/parameters/enabled, the case passes:
------------------
[root@5f9Lab15 cgroup]# ./test_zswap
ok 1 # SKIP test_no_kmem_bypass
ok 2 test_no_invasive_cgroup_shrink
[root@5f9Lab15 cgroup]# uname -r
6.6.25-2.2_rc1.an23.x86_64
[root@5f9Lab15 cgroup]# pwd
/root/rpmbuild/BUILD/kernel-6.6.25-2.2_rc1.an23/linux-6.6.25-2.2_rc1.an23.x86_64/tools/testing/selftests/cgroup