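Bug 8602 - [ANCK-5.10-016.3][rc1][x86_64] intel_pstate03 test case fails: after setting the governor to powersave, the difference between the current CPU frequency and the minimum frequency is expected to be less than 100000, but this is not met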
Status: CLOSED INVALID
Alias: None
Product: ANCK 5.10 Dev
Classification: ANCK
Component: sched
Version: 5.10.y-16.3
Hardware: x86_64 Linux
Importance: P3-Medium S3-normal
Target Milestone: ---
Assignee: kun(llfl)
QA Contact:
URL:
Whiteboard:
Keywords: Function
Depends on:
Blocks:
 
Reported: 2024-03-21 18:24 UTC by shanxifanshi
Modified: 2024-03-26 17:16 UTC

See Also:


Description shanxifanshi alibaba_cloud_group 2024-03-21 18:24:20 UTC
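[Defect description]:
The intel_pstate03 test case fails. After the governor is set to powersave, the absolute difference between the current CPU frequency and the minimum frequency is expected to be less than 100000, but this is not met. The performance governor shows the same problem.


Test log: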
<<<test_start>>>
tag=intel_pstate03 stime=1711015439
cmdline="intel_pstate03.sh"
contacts=""
analysis=exit
<<<test_output>>>
incrementing stop
intel_pstate03 1 TFAIL: freqency verify failed for powersave governor
intel_pstate03 1 TFAIL: freqency verify failed for performance governor

Summary:
passed   0
failed   2
skipped  0
warnings 0
<<<execution_status>>>
initiation_status="ok"
duration=61 termination_type=exited termination_id=1 corefile=no
cutime=8 cstime=5
<<<test_end>>>


Reproduction environment:
anck 5.10 x86 g8i ecs

Reproduction rate:
Always reproducible

Kernel information:
# uname -r
5.10.134-16.3_rc1.an8.x86_64


Operating system information:
# cat /etc/os-release
NAME="Anolis OS"
VERSION="8.8"
ID="anolis"
ID_LIKE="rhel fedora centos"
VERSION_ID="8.8"
PLATFORM_ID="platform:an8"
PRETTY_NAME="Anolis OS 8.8"
ANSI_COLOR="0;31"
HOME_URL="https://openanolis.cn/"

CPU information:
# lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              8
On-line CPU(s) list: 0-7
Thread(s) per core:  2
Core(s) per socket:  4
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
BIOS Vendor ID:      Alibaba Cloud
CPU family:          6
Model:               143
Model name:          Intel(R) Xeon(R) Platinum 8475B
BIOS Model name:     pc-q35-df-2.1
Stepping:            8
CPU MHz:             3200.221
CPU max MHz:         3800.0000
CPU min MHz:         800.0000
BogoMIPS:            5400.00
Hypervisor vendor:   KVM
Virtualization type: full
L1d cache:           48K
L1i cache:           32K
L2 cache:            2048K
L3 cache:            99840K
NUMA node0 CPU(s):   0-7
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq monitor ssse3 fma cx16 pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single ibrs_enhanced fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx_vnni avx512_bf16 wbnoinvd ida arat hwp hwp_notify hwp_act_window hwp_epp hwp_pkg_req avx512vbmi umip pku ospke waitpkg avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid bus_lock_detect cldemote movdiri movdir64b enqcmd fsrm uintr md_clear serialize tsxldtrk amx_bf16 avx512_fp16 amx_tile amx_int8 arch_capabilities

Memory information:
# free -h
              total        used        free      shared  buff/cache   available
Mem:           30Gi       261Mi        29Gi       2.0Mi       992Mi        29Gi
Swap:            0B          0B          0B


Kernel parameters:
# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt3)/boot/vmlinuz-5.10.134-16.3_rc1.an8.x86_64 root=UUID=31fda586-5228-4434-99dc-e779134d4e43 ro cryptomgr.notests rcupdate.rcu_cpu_stall_timeout=300 vring_force_dma_api rhgb quiet biosdevname=0 net.ifnames=0 console=tty0 console=ttyS0,115200n8 noibrs nvme_core.io_timeout=4294967295 nvme_core.admin_timeout=4294967295 cgroup.memory=nokmem crashkernel=0M-2G:0M,2G-8G:192M,8G-:256M processor.max_cstate=1 intel_idle.max_cstate=0

[Reproduction steps]:
Brief manual reproduction steps:
std_freq=`cpupower frequency-info | grep 'hardware limits' | awk -F '- ' '{print$2}' | awk -F G '{print $1}'`
std_freq=`echo "scale=0; $std_freq * 1000000 / 1" | bc`
min_freq=`cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq`
max_freq=`cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq`

mode="powersave"
echo 0 > /sys/devices/system/cpu/intel_pstate/no_turbo

# Set the governor to powersave
cpupower frequency-set -g $mode
cpupower frequency-set -d $min_freq -u $std_freq

# Query the current CPU frequency repeatedly (30 times)
curr_freq=`cat /sys/devices/system/cpu/cpu0/cpufreq/*_cur_freq`

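[Expected result]:
The absolute difference between the current CPU frequency curr_freq and the minimum CPU frequency is less than 100000

[Actual result]:
The test case fails. The difference is about 2300000, far greater than 100000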
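
The pass condition described above can be written out as a small check. This is a minimal sketch, not the code of intel_pstate03.sh: the helper names abs_diff and freq_near_min are hypothetical, the values are treated as kHz (the unit used by the cpufreq sysfs files), and the reading 3100000 is an illustrative number chosen to give a difference of about 2300000, not a measured value.

```shell
# abs_diff A B: print |A - B| for two integer frequencies in kHz.
abs_diff() {
    local d=$(( $1 - $2 ))
    if [ "$d" -lt 0 ]; then
        d=$(( 0 - d ))
    fi
    echo "$d"
}

# freq_near_min CUR MIN [TOL]: succeed when |CUR - MIN| < TOL.
# TOL defaults to 100000, the threshold quoted in this report.
freq_near_min() {
    local tol=${3:-100000}
    [ "$(abs_diff "$1" "$2")" -lt "$tol" ]
}

# Illustrative values only: 800000 matches "CPU min MHz: 800.0000" above,
# 3100000 is a hypothetical current-frequency reading.
if freq_near_min 3100000 800000; then
    echo "within tolerance"
else
    echo "outside tolerance: diff=$(abs_diff 3100000 800000) kHz"
fi
```

With these inputs the check takes the failing branch, which is the shape of the result reported here.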
Comment 1 shanxifanshi alibaba_cloud_group 2024-03-21 18:29:47 UTC
The intel_pstate02, intel_pstate04, and intel_pstate05 test cases show similar problems
Comment 2 Guanjun alibaba_cloud_group 2024-03-22 15:44:12 UTC
Kun, could you please take a look? Thanks.
Comment 3 kun(llfl) alibaba_cloud_group 2024-03-22 17:21:23 UTC
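I learned that the test machine is a virtual machine.
1. According to the virtualization team, CPU frequency scaling is not passed through to virtual machines and is supported only on bare-metal machines, so it is expected that frequency cannot be scaled in a virtual machine.
2. The test case sets min_freq=`cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq` and max_freq=`cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq`, and these two values look questionable. cpuinfo_min_freq and cpuinfo_max_freq are the minimum and maximum frequencies the hardware supports, not the range that can actually be set (that is, on real hardware not every frequency in this range is necessarily reachable). scaling_min_freq and scaling_max_freq give the range that is actually adjustable. The test case reads cpuinfo_min_freq and cpuinfo_max_freq, and that range does not represent the truly adjustable range.

So this issue is a false report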
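
The difference between the two pairs of files in point 2 can be inspected on a given machine by printing them side by side. This is a minimal sketch, assuming the standard cpufreq sysfs layout; the helper name print_freq_ranges is hypothetical, and any of the files may be absent on a guest without cpufreq support, so missing files are reported rather than treated as errors.

```shell
# print_freq_ranges [DIR]: print the hardware limits (cpuinfo_*) and the
# current policy limits (scaling_*) from one cpufreq directory.
# DIR defaults to the cpu0 policy directory; values are in kHz.
print_freq_ranges() {
    local dir=${1:-/sys/devices/system/cpu/cpu0/cpufreq}
    local f
    for f in cpuinfo_min_freq cpuinfo_max_freq scaling_min_freq scaling_max_freq; do
        if [ -r "$dir/$f" ]; then
            echo "$f=$(cat "$dir/$f")"
        else
            echo "$f=unavailable"
        fi
    done
}

print_freq_ranges
```

If the scaling_* values are narrower than the cpuinfo_* values, a test that compares the current frequency against cpuinfo_min_freq is comparing against a bound the policy may never reach.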
Comment 4 yunmeng365524 2024-03-25 10:15:02 UTC
Mark this test case; from now on it is treated as unsupported on virtual machines by default.
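
One way such a marking could be enforced is a guard that detects a guest before the frequency checks run. This is a sketch under stated assumptions, not the change actually made to the test suite: the helper name is_virtual_machine is hypothetical, and it relies on the x86 "hypervisor" CPU flag, which appears in the lscpu output in this report. How the skip is reported to the test framework is left out.

```shell
# is_virtual_machine [CPUINFO_FILE]: succeed when the x86 "hypervisor"
# CPU flag is present. CPUINFO_FILE defaults to /proc/cpuinfo; the
# parameter exists so the check can be exercised against a sample file.
is_virtual_machine() {
    local cpuinfo=${1:-/proc/cpuinfo}
    grep '^flags' "$cpuinfo" 2>/dev/null | grep -qw hypervisor
}

if is_virtual_machine; then
    echo "intel_pstate03: skipped on a virtual machine"
else
    echo "intel_pstate03: no hypervisor flag found, running frequency checks"
fi
```

The flag only says that the kernel is running under a hypervisor; it does not by itself prove that frequency scaling is unavailable, so this guard follows the policy stated in this comment rather than probing the hardware.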
Comment 5 yunmeng365524 2024-03-25 10:16:25 UTC
Closed as not a problem.