Bug 11812 - [Anolis23.2][6.6.25-2.2_rc1][x86_64] kernel-selftests: bpf.test_dev_cgroup fails with (cgroup_helpers.c:93: errno: Invalid argument) Enabling controller cpu: /mnt/cgroup.subtree_control
Status: CONFIRMED
Alias: None
Product: ANCK 6.6 Dev
Classification: ANCK
Component: generic
Version: 6.6.25-2.2
Hardware: All Linux
Importance: P3-Medium S3-normal
Target Milestone: ---
Assignee: banye97
QA Contact:
URL:
Whiteboard:
Keywords: Function
Depends on:
Blocks:
 
Reported: 2024-11-12 11:41 UTC by anolislw
Modified: 2024-11-27 16:01 UTC

Description anolislw alibaba_cloud_group 2024-11-12 11:41:07 UTC
[Problem description]
On an internal physical machine running Anolis 23 (x86) with kernel 6.6.25-2.2_rc1.an23.x86_64, the kernel-selftests case bpf.test_dev_cgroup fails with: (cgroup_helpers.c:93: errno: Invalid argument) Enabling controller cpu: /mnt/cgroup.subtree_control


[Actual result]
[root@5f9Lab15 bpf]# ./test_dev_cgroup
(cgroup_helpers.c:93: errno: Invalid argument) Enabling controller cpu: /mnt/cgroup.subtree_control
Failed to setup cgroup environment
Failed to create test cgroup

[root@5f9Lab15 ltp]# uname -r
6.6.25-2.2_rc1.an23.x86_64


[Expected result]
The test case passes.

[Steps to reproduce]
1. wget https://build.openanolis.cn/kojifiles/output/nightly/anolis-23-20241101.5/compose/os/source/tree/Packages/kernel-6.6.25-2.2_rc1.an23.src.rpm
2. rpm -i kernel-6.6.25-2.2_rc1.an23.src.rpm
   yum-builddep -y /root/rpmbuild/SPECS/kernel.spec 
   rpmbuild -bp /root/rpmbuild/SPECS/kernel.spec 
   cd /root/rpmbuild/BUILD/kernel-*/linux-*
   make -C tools/testing/selftests/
   cd  tools/testing/selftests/
   ln -s /lib/debug/lib/modules/$(uname -r)/vmlinux /lib/modules/$(uname -r)/build/
   make KDIR=/lib/modules/$(uname -r)/build/ -C bpf/
   cd bpf;./test_dev_cgroup

[Environment]
[root@5f9Lab15 resctrl]# uname -a
Linux 5f9Lab15 6.6.25-2.2_rc1.an23.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Oct 31 21:26:10 CST 2024 x86_64 x86_64 x86_64 GNU/Linux
[root@5f9Lab15 resctrl]#
[root@5f9Lab15 resctrl]# cat /proc/cmdline
BOOT_IMAGE=(hd1,gpt2)/vmlinuz-6.6.25-2.2_rc1.an23.x86_64 root=UUID=dca78281-9421-4a9b-9bb5-c4ec9804a355 ro resume=UUID=85dbb4f6-3cdd-4b68-b32e-ddfef04aaf1a rhgb quiet selinux=0 cgroup.memory=nokmem crashkernel=0M-2G:0M,2G-8G:192M,8G-:256M
[root@5f9Lab15 resctrl]#
[root@5f9Lab15 resctrl]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda4       444G   18G  426G   5% /
devtmpfs        4.0M  600K  3.5M  15% /dev
tmpfs           126G     0  126G   0% /dev/shm
efivarfs        268K  164K  100K  63% /sys/firmware/efi/efivars
tmpfs            51G  2.5M   51G   1% /run
tmpfs           126G   56M  126G   1% /tmp
/dev/sda2       960M  234M  727M  25% /boot
/dev/sda1       200M  6.2M  194M   4% /boot/efi
tmpfs            26G   48K   26G   1% /run/user/0
[root@5f9Lab15 resctrl]#
[root@5f9Lab15 resctrl]# free -g
               total        used        free      shared  buff/cache   available
Mem:             251           3         249           0           0         248
Swap:              1           0           1
[root@5f9Lab15 resctrl]#
[root@5f9Lab15 resctrl]# lscpu
Architecture:             x86_64
  CPU op-mode(s):         32-bit, 64-bit
  Address sizes:          46 bits physical, 57 bits virtual
  Byte Order:             Little Endian
CPU(s):                   48
  On-line CPU(s) list:    0-47
Vendor ID:                GenuineIntel
  BIOS Vendor ID:         Intel(R) Corporation
  Model name:             Intel(R) Xeon(R) Silver 4310 CPU @ 2.10GHz
    BIOS Model name:      Intel(R) Xeon(R) Silver 4310 CPU @ 2.10GHz  CPU @ 2.1GHz
    BIOS CPU family:      179
    CPU family:           6
    Model:                106
    Thread(s) per core:   2
    Core(s) per socket:   12
    Socket(s):            2
    Stepping:             6
    CPU(s) scaling MHz:   27%
    CPU max MHz:          3300.0000
    CPU min MHz:          800.0000
    BogoMIPS:             4200.00
    Flags:                fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss
                          ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc
                          cpuid aperfmperf pni pclmulqdq dtes64 ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1
                          sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault
                          epb cat_l3 ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow flexpriority ept vpid ept_ad fsgsbase tsc_adjust
                          bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt
                          clwb intel_pt avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total
                          cqm_mbm_local split_lock_detect wbnoinvd dtherm ida arat pln pts vnmi avx512vbmi umip pku ospke avx512_vbmi2
                          gfni vaes vpclmulqdq avx512_vnni avx512_bitalg tme avx512_vpopcntdq rdpid fsrm md_clear pconfig flush_l1d
                          arch_capabilities
Virtualization features:
  Virtualization:         VT-x
Caches (sum of all):
  L1d:                    1.1 MiB (24 instances)
  L1i:                    768 KiB (24 instances)
  L2:                     30 MiB (24 instances)
  L3:                     36 MiB (2 instances)
NUMA:
  NUMA node(s):           2
  NUMA node0 CPU(s):      0-11,24-35
  NUMA node1 CPU(s):      12-23,36-47
Vulnerabilities:
  Gather data sampling:   Mitigation; Microcode
  Itlb multihit:          Not affected
  L1tf:                   Not affected
  Mds:                    Not affected
  Meltdown:               Not affected
  Mmio stale data:        Mitigation; Clear CPU buffers; SMT vulnerable
  Reg file data sampling: Not affected
  Retbleed:               Not affected
  Spec rstack overflow:   Not affected
  Spec store bypass:      Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:             Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:             Mitigation; Enhanced / Automatic IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequence
  Srbds:                  Not affected
  Tsx async abort:        Not affected
Comment 1 anolislw alibaba_cloud_group 2024-11-12 11:42:23 UTC
The following test cases have this problem:
-----------
test_dev_cgroup
test_sock
test_sockmap
get_cgroup_id_user
test_cgroup_storage
test_tcpnotify_user
test_sysctl
test_sock_addr.sh
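
The cases listed above could be re-checked in one pass with something like the following. This is only a sketch: `check_cgroup_setup` is a hypothetical helper name, and it assumes the binaries have already been built in the bpf selftests directory and that a failed cgroup setup always prints "Failed to setup cgroup environment", as in the logs in this report.

```shell
# Hypothetical helper (not part of kernel-selftests): run each named test
# in directory $1 and report whether its output contains the cgroup setup
# failure message seen in this bug.
check_cgroup_setup() {
    dir=$1
    shift
    for t in "$@"; do
        # Capture stdout+stderr; remember the exit status without aborting
        # when the caller runs under "set -e".
        out=$(cd "$dir" && "./$t" 2>&1) && rc=0 || rc=$?
        if printf '%s\n' "$out" | grep -q 'Failed to setup cgroup environment'; then
            echo "$t: cgroup setup failed (exit $rc)"
        else
            echo "$t: no cgroup setup failure (exit $rc)"
        fi
    done
}

# Example invocation (run as root from tools/testing/selftests):
#   check_cgroup_setup bpf test_dev_cgroup test_sock test_sockmap \
#       get_cgroup_id_user test_cgroup_storage test_tcpnotify_user \
#       test_sysctl test_sock_addr.sh
```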
Comment 2 zhangxinyi alibaba_cloud_group 2024-11-15 17:30:58 UTC
Judging from the error, enabling the cpu controller for the cgroup fails:
echo '+cpu' > /sys/fs/cgroup/cgroup.subtree_control 
-bash: echo: write error: Invalid argument
Try retesting after cleaning up the other processes; see:
https://access.redhat.com/solutions/6582021
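
To see which controllers can and cannot be enabled, each one could be tried separately. The following is a minimal sketch, assuming the selftest helper fails on the same kind of write as the manual `echo '+cpu'` above; `try_enable_controllers` is a hypothetical name. Note that on a real cgroup2 mount this actually enables every controller that succeeds, so it is only suitable for a test machine.

```shell
# Hypothetical diagnostic: for every controller listed in
# <dir>/cgroup.controllers, try to enable it in <dir>/cgroup.subtree_control
# and report the result per controller. The shell's own "write error"
# message (with the errno text) is left visible on stderr.
try_enable_controllers() {
    dir=$1
    for c in $(cat "$dir/cgroup.controllers"); do
        if echo "+$c" > "$dir/cgroup.subtree_control"; then
            echo "$c: ok"
        else
            echo "$c: FAILED"
        fi
    done
}

# Example invocation (as root, on a test machine):
#   try_enable_controllers /sys/fs/cgroup
```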
Comment 3 zhangxinyi alibaba_cloud_group 2024-11-20 16:07:57 UTC
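Following the comment at https://bugzilla.openanolis.cn/show_bug.cgi?id=11778#c3, I confirmed that the system uses only cgroup v2 and cleaned up the other RR processes, but cpu still cannot be written to cgroup.subtree_control.
Environment: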
mount | grep cgroup
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime)

cat /sys/fs/cgroup/cgroup.controllers 
cpuset cpu io memory hugetlb pids rdma 

ps -T axo pid,ppid,user,group,lwp,nlwp,start_time,comm,cgroup,cls | grep RR
(no output)

Test:
echo '+cpu' > /sys/fs/cgroup/cgroup.subtree_control
-bash: echo: write error: Invalid argument
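
One possible gap in the check above: `grep RR` only matches SCHED_RR tasks, while SCHED_FIFO tasks (shown by `ps` as class `FF`) are realtime as well. The cgroup v2 documentation notes that the cpu controller can only be enabled when all realtime processes are in the root cgroup, which applies to kernels built with CONFIG_RT_GROUP_SCHED. The sketch below filters for both classes outside the root cgroup. The function name `rt_tasks_outside_root` is hypothetical, and treating `0::/` and `-` in the cgroup column as "root cgroup" is an assumption about how `ps` prints it on a cgroup v2-only system.

```shell
# Hypothetical filter: read "pid lwp cls cgroup comm" lines on stdin and
# print the realtime tasks (class RR or FF) that are not in the root cgroup.
# Assumption: the root cgroup appears as "0::/" or "-" in the cgroup column.
rt_tasks_outside_root() {
    awk '($3 == "RR" || $3 == "FF") && $4 != "0::/" && $4 != "-" { print }'
}

# Example invocation:
#   ps -eLo pid,lwp,cls,cgroup,comm | rt_tasks_outside_root
```

If this prints nothing and the write still fails, realtime tasks outside the root cgroup are probably not the cause, and the kernel configuration (for example `CONFIG_RT_GROUP_SCHED` in the running kernel's config file) may be worth checking.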
Comment 4 banye97 alibaba_cloud_group 2024-11-21 18:13:59 UTC
First, please confirm whether the same problem exists in version 6.6.25-002.1.
Comment 5 anolislw alibaba_cloud_group 2024-11-27 15:17:09 UTC
(In reply to banye97 from comment #4)
> First, please confirm whether the same problem exists in version 6.6.25-002.1.

The problem also exists with kernel 6.6.25-2.1.an23.x86_64. The affected test cases are:
============================
test_dev_cgroup
test_sock
test_sockmap
get_cgroup_id_user
test_cgroup_storage
test_tcpnotify_user
test_sysctl
test_sock_addr.sh
----------------------------

Details of the failures:
============================
[root@5f9Lab15 selftests]# cd bpf;./test_dev_cgroup
(cgroup_helpers.c:93: errno: Invalid argument) Enabling controller cpu: /mnt/cgroup.subtree_control
Failed to setup cgroup environment
Failed to create test cgroup
[root@5f9Lab15 bpf]#
[root@5f9Lab15 bpf]# ./test_dev_cgroup
(cgroup_helpers.c:93: errno: Invalid argument) Enabling controller cpu: /mnt/cgroup.subtree_control
Failed to setup cgroup environment
Failed to create test cgroup
[root@5f9Lab15 bpf]#
[root@5f9Lab15 bpf]# ./test_sock
(cgroup_helpers.c:93: errno: Invalid argument) Enabling controller cpu: /mnt/cgroup.subtree_control
Failed to setup cgroup environment
[root@5f9Lab15 bpf]#
[root@5f9Lab15 bpf]# ./test_sockmap
(cgroup_helpers.c:93: errno: Invalid argument) Enabling controller cpu: /mnt/cgroup.subtree_control
Failed to setup cgroup environment
[root@5f9Lab15 bpf]#
[root@5f9Lab15 bpf]# ./get_cgroup_id_user
(cgroup_helpers.c:93: errno: Invalid argument) Enabling controller cpu: /mnt/cgroup.subtree_control
Failed to setup cgroup environment
main:FAIL:cgroup_setup_and_join err -22 errno 22
[root@5f9Lab15 bpf]#
[root@5f9Lab15 bpf]# ./test_cgroup_storage
(cgroup_helpers.c:93: errno: Invalid argument) Enabling controller cpu: /mnt/cgroup.subtree_control
Failed to setup cgroup environment
Failed to attach bpf program
[root@5f9Lab15 bpf]#
[root@5f9Lab15 bpf]# ./test_tcpnotify_user
(cgroup_helpers.c:93: errno: Invalid argument) Enabling controller cpu: /mnt/cgroup.subtree_control
Failed to setup cgroup environment
[root@5f9Lab15 bpf]#
[root@5f9Lab15 bpf]# ./test_sysctl
(cgroup_helpers.c:93: errno: Invalid argument) Enabling controller cpu: /mnt/cgroup.subtree_control
Failed to setup cgroup environment
[root@5f9Lab15 bpf]#
[root@5f9Lab15 bpf]# ./test_sock_addr.sh
Wait for testing IPv4/IPv6 to become available .
.. OK
(cgroup_helpers.c:93: errno: Invalid argument) Enabling controller cpu: /mnt/cgroup.subtree_control
Failed to setup cgroup environment
[root@5f9Lab15 bpf]#
[root@5f9Lab15 bpf]# uname -r
6.6.25-2.1.an23.x86_64
[root@5f9Lab15 bpf]#
[root@5f9Lab15 bpf]# pwd
/root/rpmbuild/BUILD/ker
Comment 6 banye97 alibaba_cloud_group 2024-11-27 16:01:07 UTC
This is not a new problem introduced in 6.6.25-2.2; it will not be fixed in this release for now.