Bug 4252 - [Anck 5.10 nightly/ANCK-5.10-14-rc1][Anolis8][x86_64] The "PMU event map aliases" case in perf-sanity-tests fails, PMU events subtest 2: FAILED!
Status: CLOSED FIXED
Alias: None
Product: Antest
Classification: Infrastructures
Component: Test cases
Version: unspecified
Hardware: x86_64 Linux
Importance: P3-Medium S3-normal
Target Milestone: ---
Assignee: zhangjing
Reported: 2023-02-28 13:34 UTC by shanxifanshi
Modified: 2023-07-25 15:17 UTC

Description shanxifanshi alibaba_cloud_group 2023-02-28 13:34:56 UTC
[Defect description]:
The "PMU event map aliases" case in perf-sanity-tests fails, PMU events subtest 2: FAILED!

Test log:

# perf test -v "PMU event map aliases"
10: PMU events                                                      :
10.2: PMU event map aliases                                         :
--- start ---
test child forked, pid 1057898
Using CPUID GenuineIntel-6-4F-1
testing PMU cpu aliases: failed
test child finished with -1
---- end ----
PMU events subtest 2: FAILED!


[Environment]:
perf version:
# perf -v
perf version 5.10.134-587.git.04d8c8489.an8.x86_64

Kernel:
# uname -r
5.10.134-587.git.04d8c8489.an8.x86_64

Operating system:
# cat /etc/os-release
NAME="Anolis OS"
VERSION="8.8"
ID="anolis"
ID_LIKE="rhel fedora centos"
VERSION_ID="8.8"
PLATFORM_ID="platform:an8"
PRETTY_NAME="Anolis OS 8.8"
ANSI_COLOR="0;31"
HOME_URL="https://openanolis.cn/"

CPU:
# lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              24
On-line CPU(s) list: 0-23
Thread(s) per core:  2
Core(s) per socket:  12
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
BIOS Vendor ID:      Intel
CPU family:          6
Model:               63
Model name:          Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
BIOS Model name:     Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
Stepping:            2
CPU MHz:             2494.873
CPU max MHz:         2500.0000
CPU min MHz:         1200.0000
BogoMIPS:            4988.85
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            30720K
NUMA node0 CPU(s):   0-23
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm cpuid_fault epb invpcid_single pti intel_ppin ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm xsaveopt cqm_llc cqm_occup_llc dtherm arat pln pts md_clear flush_l1d


Memory:
# free -h
              total        used        free      shared  buff/cache   available
Mem:           62Gi       1.4Gi        57Gi       137Mi       3.3Gi        60Gi
Swap:         2.0Gi        36Mi       2.0Gi

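[Expected result]:
The case passes.

[Actual result]:
The case fails.

[Reproducibility]: always

[Steps to reproduce]:
1. Install the latest perf and python3-perf packages that match the running kernel.
2. perf test -v "PMU event map aliases"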
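
The reproduction step above can be wrapped in a small pass/fail check for a nightly job. This is a minimal sketch, assuming the verbose output keeps the `PMU events subtest N: FAILED!` line format shown in the test log of this report; the helper name `pmu_subtest_failed` is hypothetical and not part of perf.

```shell
# Hypothetical helper (not part of perf): reads `perf test -v` output on
# stdin and returns 0 if any subtest line ends in "FAILED!", using the
# line format seen in this report.
pmu_subtest_failed() {
    grep -q 'subtest [0-9][0-9]*: FAILED!'
}

# Usage on a machine with perf installed (left commented out so the
# snippet can be sourced anywhere):
#   perf test -v "PMU event map aliases" 2>&1 | pmu_subtest_failed \
#       && echo "PMU event map aliases: still failing"
```

Matching on the summary line rather than on the exit status is a design choice for this sketch: it only relies on text that appears in the logs quoted here.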

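[Cause analysis]:
1. The case first failed in the nightly build on the evening of February 24 and has failed consistently in the days since.
2. Several perf-related commits were submitted recently; the failure may have been introduced by those changes.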
Comment 1 shanxifanshi alibaba_cloud_group 2023-02-28 13:38:28 UTC
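With perf switched back to the version from the 23rd, the case passes; with the perf version from the 24th, it fails. The problem was probably introduced by the perf-related commits submitted on the 24th.

Result with the perf built on the evening of the 24th: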

# perf -v
perf version 5.10.134-584.git.8e4c8a3e2.an8.x86_64

# perf test -v "PMU event map aliases"
10: PMU events                                                      :
10.2: PMU event map aliases                                         :
--- start ---
test child forked, pid 1057898
Using CPUID GenuineIntel-6-4F-1
testing PMU cpu aliases: failed
test child finished with -1
---- end ----
PMU events subtest 2: FAILED!


Result with the perf built on the evening of the 23rd:

# perf -v
perf version 5.10.134-583.git.ead5e8d2a.an8.x86_64

# perf test -v "PMU event table sanity"
10: PMU events                                                      :
10.1: PMU event table sanity                                        :
--- start ---
test child forked, pid 1057383
testing event table uncore_hisi_ddrc.flux_wcmd: pass
testing event table unc_cbo_xsnp_response.miss_eviction: pass
testing event table bp_l1_btb_correct: pass
testing event table bp_l2_btb_correct: pass
testing event table l3_cache_rd: pass
testing event table segment_reg_loads.any: pass
testing event table dispatch_blocked.any: pass
testing event table eist_trans: pass
test child finished with 0
---- end ----
PMU events subtest 1: Ok
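
When comparing nightlies as in this comment, it can help to record which build each result came from. This is a minimal sketch, assuming the `perf -v` string keeps the `<kernel version>-<build>.git.<hash>` layout seen above; `perf_build_number` is a hypothetical helper name, not a perf feature.

```shell
# Hypothetical helper: extracts the nightly build number from a `perf -v`
# line such as "perf version 5.10.134-584.git.8e4c8a3e2.an8.x86_64".
# Prints nothing if the line does not follow that layout.
perf_build_number() {
    sed -n 's/^perf version [0-9.]*-\([0-9][0-9]*\)\.git\..*$/\1/p'
}

# Usage on a machine with perf installed:
#   perf -v | perf_build_number
```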
Comment 2 shanxifanshi alibaba_cloud_group 2023-02-28 14:24:16 UTC
Also, this case fails only on physical machines; it passes on ECS.
Comment 3 shanxifanshi alibaba_cloud_group 2023-03-02 10:34:16 UTC
The problem also exists on the an8 5.10.134-14_rc1 kernel.

# uname -r
5.10.134-14_rc1.an8.x86_64

Test log:
10.2: PMU event map aliases                                         :
--- start ---
test child forked, pid 8775
Using CPUID GenuineIntel-6-55-4
testing PMU cpu aliases: failed
test child finished with -1
---- end ----
PMU events subtest 2: FAILED!
Comment 4 yunmeng365524 2023-03-04 21:37:06 UTC
Please help confirm whether this is the same problem as the bug below.
One odd thing about this bug is that it fails only on physical machines.
https://bugzilla.openanolis.cn/show_bug.cgi?id=4247
Comment 5 yunhe123 alibaba_cloud_group 2023-03-06 11:10:16 UTC
(In reply to yunmeng365524 from comment #4)
> Please help confirm whether this is the same problem as the bug below.
> One odd thing about this bug is that it fails only on physical machines.
> https://bugzilla.openanolis.cn/show_bug.cgi?id=4247

The behavior on anolis8-5.10 arm differs somewhat from x86: physical machines and Yitian ECS pass, while ECS in the community nightly fails. Bug report: https://bugzilla.openanolis.cn/show_bug.cgi?id=4254
Comment 6 zhangjing alibaba_cloud_group 2023-03-10 11:49:21 UTC
Fixed. PR: https://e.gitee.com/openanolis/repos/anolis/cloud-kernel/pulls/1392
Comment 7 zhangjing alibaba_cloud_group 2023-03-10 12:30:56 UTC
Fixed.
Comment 8 shanxifanshi alibaba_cloud_group 2023-03-10 13:55:58 UTC
Verified on the latest nightly: the case passes, the problem is resolved, and the bug is closed.

# perf -v
perf version 5.10.134-597.git.78c79f183.an8.x86_64

# uname -r
5.10.134-597.git.78c79f183.an8.x86_64

# perf test 'PMU event map aliases'
10: PMU events                                                      :
10.2: PMU event map aliases                                         : Ok