Bug 8879 - [Anolis23.1 GA][Beta][ANCK-6.6.25-2] open_posix_testsuite conformance test: pthread_setschedparam_1-1 fails because the call to pthread_setschedparam returns an error, Error at pthread_setschedparam: rc=1
Summary: [Anolis23.1 GA][Beta][ANCK-6.6.25-2] open_posix_testsuite conformance test, pthread_sets...
Status: CLOSED DUPLICATE of bug 2664
Alias: None
Product: ANCK 6.6 Dev
Classification: ANCK
Component: sched
Version: 6.6.25-2
Hardware: All Linux
Importance: P3-Medium S3-normal
Target Milestone: ---
Assignee: dtcccc
QA Contact: CruzZhao
URL:
Whiteboard:
Keywords: Function
Depends on:
Blocks:
 
Reported: 2024-04-24 14:43 UTC by yunhe123
Modified: 2024-05-20 19:35 UTC

See Also:


Description yunhe123 alibaba_cloud_group 2024-04-24 14:43:41 UTC
[Defect description]:
[Anolis23.1-GA][Beta][ANCK-6.6.25-2] In the open_posix_testsuite conformance tests, pthread_setschedparam_1-1 fails: the call to pthread_setschedparam returns an error (Error at pthread_setschedparam: rc=1). The log is as follows:

# pwd
/tmp/tone/run/open_posix_testsuite/conformance/interfaces/pthread_setschedparam
# ./run.sh
conformance/interfaces/pthread_setschedparam/pthread_setschedparam_1-1: execution: FAILED
conformance/interfaces/pthread_setschedparam/pthread_setschedparam_4-1: execution: UNRESOLVED
conformance/interfaces/pthread_setschedparam/pthread_setschedparam_1-2: execution: UNRESOLVED
*******************
Testing pthread_setschedparam
*******************
PASS              1
FAIL              3
*******************
TOTAL             4
******************


# ./pthread_setschedparam_1-1.run-test
Error at pthread_setschedparam: rc=1


A related bug is tracked on GitHub: https://github.com/coreos/bugs/issues/410. After running sysctl -w kernel.sched_rt_runtime_us=-1, the test case passes, as shown below:
# cat /proc/sys/kernel/sched_rt_runtime_us
950000

# sysctl -w kernel.sched_rt_runtime_us=-1
kernel.sched_rt_runtime_us = -1

# ./pthread_setschedparam_1-1.run-test
Test PASSED
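
The workaround above changes a global setting, so a test harness would probably want to put the previous value back afterwards. A minimal sketch of that idea, assuming root privileges; the function name `with_rt_unlimited` and the `RT_KNOB` override variable are hypothetical and exist only for illustration:

```shell
# Hypothetical helper: run a command with the global RT bandwidth limit
# disabled, then restore the previous value.
# RT_KNOB is an illustrative override; by default the real sysctl file is used.
with_rt_unlimited() {
    knob="${RT_KNOB:-/proc/sys/kernel/sched_rt_runtime_us}"
    saved=$(cat "$knob") || return 1        # remember the current value (950000 in the log above)
    printf '%s\n' "-1" > "$knob" || return 1  # -1 removes the limit
    "$@"                                    # run the test command
    rc=$?
    printf '%s\n' "$saved" > "$knob"        # restore the original value
    return "$rc"
}
```

It could be invoked as `with_rt_unlimited ./pthread_setschedparam_1-1.run-test`; the helper returns the exit status of the wrapped command.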


[Reproduction rate]:
Always reproducible

[Reproduction environment]:
Environment: ECS

Kernel information:
# uname -r
6.6.25-2_rc1.an23.aarch64

Version information:
# cat /etc/os-release
NAME="Anolis OS"
VERSION="23"
ID="anolis"
VERSION_ID="23"
PLATFORM_ID="platform:an23"
PRETTY_NAME="Anolis OS 23"
ANSI_COLOR="0;31"
HOME_URL="https://openanolis.cn/"
BUG_REPORT_URL="https://bugzilla.openanolis.cn/"

Memory information:
# free -h
               total        used        free      shared  buff/cache   available
Mem:            30Gi       508Mi        27Gi       1.2Gi       2.2Gi        28Gi
Swap:             0B          0B          0B

CPU information:
# lscpu
Architecture:             aarch64
  CPU op-mode(s):         32-bit, 64-bit
  Byte Order:             Little Endian
CPU(s):                   8
  On-line CPU(s) list:    0-7
Vendor ID:                ARM
  BIOS Vendor ID:         Alibaba Cloud
  Model name:             Neoverse-N2
    BIOS Model name:      virt-rhel7.6.0  CPU @ 2.0GHz
    BIOS CPU family:      1
    Model:                0
    Thread(s) per core:   1
    Core(s) per socket:   8
    Socket(s):            1
    Stepping:             r0p0
    BogoMIPS:             100.00
    Flags:                fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512
                          sve asimdfhm dit uscat ilrcpc flagm ssbs sb dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16
                          i8mm bf16 dgh
Caches (sum of all):
  L1d:                    512 KiB (8 instances)
  L1i:                    512 KiB (8 instances)
  L2:                     8 MiB (8 instances)
  L3:                     64 MiB (1 instance)
NUMA:
  NUMA node(s):           1
  NUMA node0 CPU(s):      0-7
Vulnerabilities:
  Gather data sampling:   Not affected
  Itlb multihit:          Not affected
  L1tf:                   Not affected
  Mds:                    Not affected
  Meltdown:               Not affected
  Mmio stale data:        Not affected
  Reg file data sampling: Not affected
  Retbleed:               Not affected
  Spec rstack overflow:   Not affected
  Spec store bypass:      Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:             Mitigation; __user pointer sanitization
  Spectre v2:             Mitigation; CSV2, BHB
  Srbds:                  Not affected
  Tsx async abort:        Not affected


[Reproduction steps]:
1. Download the latest upstream LTP: git clone https://github.com/linux-test-project/ltp.git
2. cd ltp/testcases/open_posix_testsuite
3. make all
4. Run the relevant test case:
# cd /tmp/tone/run/open_posix_testsuite/conformance/interfaces/pthread_setschedparam
# ./pthread_setschedparam_1-1.run-test


[Expected result]:
The test case passes.

[Actual result]:
The test case fails.
Comment 1 yunmeng365524 2024-05-07 21:21:32 UTC
Please confirm whether the default value of kernel.sched_rt_runtime_us needs to be changed.
Comment 2 dtcccc alibaba_cloud_group 2024-05-08 10:48:03 UTC
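This is expected behavior. The test case fails because the cgroup it runs in has no RT bandwidth allocated.

Setting kernel.sched_rt_runtime_us=-1 removes the global RT bandwidth limit; with no limit in place, the scheduling parameters can be set.

Alternatively, moving the test task into the root cgroup (which is allocated 950000/1000000 of the bandwidth by default) also makes the test pass.

cgroup v1 allows RT bandwidth to be set separately for each child group, but v2 no longer supports this. On v2, the two approaches above are therefore the only ways to make the test pass.
For a more detailed conclusion, see Bugzilla bug 2664.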
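
To see which case applies before running the test, one could check whether the current shell already sits in the root cgroup. A minimal sketch, assuming a cgroup v2 (unified) hierarchy where /proc/self/cgroup contains a single "0::<path>" entry; the function names `cgroup_v2_path` and `in_root_cgroup` are hypothetical:

```shell
# Hypothetical helper: print the cgroup v2 path recorded in a cgroup
# membership file. The optional argument is illustrative and defaults
# to /proc/self/cgroup.
cgroup_v2_path() {
    # Keep only the cgroup v2 entry and strip its "0::" prefix.
    sed -n 's/^0:://p' "${1:-/proc/self/cgroup}"
}

# Succeeds only when that path is the root group "/".
in_root_cgroup() {
    [ "$(cgroup_v2_path "$@")" = "/" ]
}
```

For example, `in_root_cgroup || echo "not in the root cgroup"` prints a warning when the shell runs in a child group, which, according to the explanation above, is the situation in which the test fails.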

*** This bug has been marked as a duplicate of bug 2664 ***
Comment 3 dtcccc alibaba_cloud_group 2024-05-11 10:43:36 UTC
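Rather than changing the global kernel.sched_rt_runtime_us setting, moving the current task into the root cgroup is the recommended approach.

It is worth checking how these test cases were run on anolis8/alinux3. As I understand it, running them inside a child cgroup in those cgroup v1 environments still requires configuring RT bandwidth, but tests generally do not do that.
I believe many earlier test cases first moved themselves into the root group of the cpu subsystem before running? For example:
echo $$ > /sys/fs/cgroup/cpu/tasks

In a cgroup v2 environment, the equivalent would be:
echo $$ > /sys/fs/cgroup/cgroup.procs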
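
The two commands above could be wrapped so that the same test setup works on both cgroup versions. A minimal sketch, assuming root privileges and the usual /sys/fs/cgroup mount point; the function name `join_root_cgroup` and the `CGROOT` override variable are hypothetical, and the presence of cgroup.controllers is used here as the marker for a v2 hierarchy:

```shell
# Hypothetical helper: move the current shell into the root cgroup.
# CGROOT is an illustrative override of the cgroup mount point.
join_root_cgroup() {
    cgroot="${CGROOT:-/sys/fs/cgroup}"
    if [ -e "$cgroot/cgroup.controllers" ]; then
        # cgroup v2 (unified hierarchy): write to the root group's process list
        echo $$ > "$cgroot/cgroup.procs"
    elif [ -e "$cgroot/cpu/tasks" ]; then
        # cgroup v1: write to the root group of the cpu subsystem
        echo $$ > "$cgroot/cpu/tasks"
    else
        echo "no known cgroup layout under $cgroot" >&2
        return 1
    fi
}
```

Calling `join_root_cgroup` before the test run moves the invoking shell, and therefore the test processes it starts afterwards, into the root group.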
Comment 4 yunhe123 alibaba_cloud_group 2024-05-13 18:31:10 UTC
After running echo $$ > /sys/fs/cgroup/cgroup.procs as suggested, the test case passes:
[root@iZbp143ti4ccpaufkzata1Z open_posix_testsuite]# echo $$ > /sys/fs/cgroup/cgroup.procs
[root@iZbp143ti4ccpaufkzata1Z open_posix_testsuite]# ./conformance/interfaces/pthread_setschedparam/pthread_setschedparam_1-1.run-test
Test PASSED