Bug 8853 - [Anolis23.1 GA][Beta][ANCK-6.6.25-2][x86_64] kernel-selftests test x86/lam_64 runs abnormally
Status: CLOSED FIXED
Alias: None
Product: ANCK 6.6 Dev
Classification: ANCK
Component: generic
Version: 6.6.25-2
Hardware: All Linux
Importance: P3-Medium S3-normal
Target Milestone: ---
Assignee: zelin
Keywords: Function
Reported: 2024-04-23 16:37 UTC by anolislw
Modified: 2024-05-20 15:12 UTC
Description anolislw alibaba_cloud_group 2024-04-23 16:37:15 UTC
[Defect description]:
The kernel-selftests test x86/lam_64 runs abnormally and prints "Unsupported LAM feature!". Please help determine whether the test case itself is at fault.


[Reproduction rate]:
Always reproducible


[Steps to reproduce]:
1. Download kernel-6.6.25-2_rc1.an23.src.rpm
2. rpm -i kernel-6.6.25-2_rc1.an23.src.rpm
3. yum-builddep -y /root/rpmbuild/SPECS/kernel.spec
   rpmbuild -bp /root/rpmbuild/SPECS/kernel.spec
   cd /root/rpmbuild/BUILD/kernel-6.6.25-2_rc1.an23/linux-6.6.25-2_rc1.an23.x86_64/tools/testing/selftests/x86
4. make; ./lam_64


[Expected result]:
The test case passes.


[Actual result]:
[root@iZbp1c9jzchxjqive233ugZ x86]# ./lam_64
# Unsupported LAM feature!


[Reproduction environment]:
Environment: ECS instance in the cloud

Last login: Tue Apr 23 15:16:32 2024 from 59.82.30.41
[root@iZbp1c9jzchxjqive233ugZ ~]# uname -r
6.6.25-2_rc1.an23.x86_64
[root@iZbp1c9jzchxjqive233ugZ ~]# cat /etc/os-release
NAME="Anolis OS"
VERSION="23"
ID="anolis"
VERSION_ID="23"
PLATFORM_ID="platform:an23"
PRETTY_NAME="Anolis OS 23"
ANSI_COLOR="0;31"
HOME_URL="https://openanolis.cn/"
BUG_REPORT_URL="https://bugzilla.openanolis.cn/"

[root@iZbp1c9jzchxjqive233ugZ ~]# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt2)/boot/vmlinuz-6.6.25-2_rc1.an23.x86_64 root=UUID=06ce37cb-4731-4a37-a95d-1f756b7eee30 ro rhgb crashkernel=0M-2G:0M,2G-8G:192M,8G-:256M cryptomgr.notests cgroup.memory=nokmem rcupdate.rcu_cpu_stall_timeout=300 quiet biosdevname=0 net.ifnames=0 console=tty0 console=ttyS0,115200n8 noibrs nvme_core.io_timeout=4294967295 nvme_core.admin_timeout=4294967295
[root@iZbp1c9jzchxjqive233ugZ ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        4.0M     0  4.0M   0% /dev
tmpfs           7.6G     0  7.6G   0% /dev/shm
tmpfs           3.1G  560K  3.1G   1% /run
/dev/nvme0n1p2   40G   14G   27G  33% /
tmpfs           7.6G     0  7.6G   0% /tmp
tmpfs           1.6G  4.0K  1.6G   1% /run/user/0
[root@iZbp1c9jzchxjqive233ugZ ~]#
[root@iZbp1c9jzchxjqive233ugZ ~]# free -g
               total        used        free      shared  buff/cache   available
Mem:              15           0          14           0           0          14
Swap:              0           0           0
[root@iZbp1c9jzchxjqive233ugZ ~]#
[root@iZbp1c9jzchxjqive233ugZ ~]# lscpu
Architecture:             x86_64
  CPU op-mode(s):         32-bit, 64-bit
  Address sizes:          52 bits physical, 57 bits virtual
  Byte Order:             Little Endian
CPU(s):                   4
  On-line CPU(s) list:    0-3
Vendor ID:                GenuineIntel
  BIOS Vendor ID:         Alibaba Cloud
  Model name:             Intel(R) Xeon(R) Platinum 8475B
    BIOS Model name:      pc-q35-df-2.1  CPU @ 0.0GHz
    BIOS CPU family:      1
    CPU family:           6
    Model:                143
    Thread(s) per core:   2
    Core(s) per socket:   2
    Socket(s):            1
    Stepping:             8
    CPU(s) scaling MHz:   83%
    CPU max MHz:          3800.0000
    CPU min MHz:          800.0000
    BogoMIPS:             5400.00
    Flags:                fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse
                          sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid
                          aperfmperf tsc_known_freq pni pclmulqdq monitor ssse3 fma cx16 pdcm pcid sse4_1 sse4_2 x2apic
                          movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault
                          ibrs_enhanced fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm avx512f avx512dq
                          rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec
                          xgetbv1 xsaves avx_vnni avx512_bf16 wbnoinvd ida arat hwp hwp_notify hwp_act_window hwp_epp
                          hwp_pkg_req avx512vbmi umip pku ospke waitpkg avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni
                          avx512_bitalg avx512_vpopcntdq rdpid bus_lock_detect cldemote movdiri movdir64b enqcmd fsrm
                          md_clear serialize tsxldtrk amx_bf16 avx512_fp16 amx_tile amx_int8 arch_capabilities
Virtualization features:
  Hypervisor vendor:      KVM
  Virtualization type:    full
Caches (sum of all):
  L1d:                    96 KiB (2 instances)
  L1i:                    64 KiB (2 instances)
  L2:                     4 MiB (2 instances)
  L3:                     97.5 MiB (1 instance)
NUMA:
  NUMA node(s):           1
  NUMA node0 CPU(s):      0-3
Vulnerabilities:
  Gather data sampling:   Not affected
  Itlb multihit:          Not affected
  L1tf:                   Not affected
  Mds:                    Not affected
  Meltdown:               Not affected
  Mmio stale data:        Unknown: No mitigations
  Reg file data sampling: Not affected
  Retbleed:               Not affected
  Spec rstack overflow:   Not affected
  Spec store bypass:      Vulnerable
  Spectre v1:             Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:             Mitigation; Enhanced / Automatic IBRS, RSB filling, PBRSB-eIBRS SW sequence
  Srbds:                  Not affected
  Tsx async abort:        Not affected
Comment 1 chenzhuo alibaba_cloud_group 2024-04-25 15:01:40 UTC
This test case is new in the 6.6 kernel; the issue looks CPU-related.
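One way to check the CPU-related suspicion is to look for a LAM flag among the CPU flags the guest advertises. The following is a minimal sketch, not part of the original report. It assumes the kernel exposes the feature under the flag name `lam` in /proc/cpuinfo (an assumption; the report does not state the flag name), and the helper names `cpu_flags` and `has_cpu_flag` are hypothetical.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_cpu_flag(cpuinfo_text, flag):
    """Exact-match lookup, so 'lam' does not match names such as 'lahf_lm'."""
    return flag in cpu_flags(cpuinfo_text)


# Short excerpt of the flags reported by lscpu in this bug, reformatted as a
# /proc/cpuinfo-style line (the full list is in the description above).
sample = "flags\t\t: fpu vme lahf_lm avx512f amx_tile arch_capabilities\n"
print("lam advertised:", has_cpu_flag(sample, "lam"))
```

On a live system the same functions could be fed the contents of /proc/cpuinfo instead of the sample string.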
Comment 2 zhangjing alibaba_cloud_group 2024-04-25 15:13:10 UTC
When the CPU does not support the LAM feature, this test should be skipped. Related upstream patch: https://lore.kernel.org/lkml/ZgfyxD15dg9tLzyT@gmail.com/t/
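The skip behaviour described here can be sketched as follows. This is an illustration only, not the upstream patch: it models the kselftest convention that a test exiting with status 4 (KSFT_SKIP) is reported as skipped, and the result-line format is modelled on the log lines quoted later in this report. The function names `lam_test_exit_code` and `runner_line` are hypothetical.

```python
# Exit codes used by kselftest programs (subset).
KSFT_PASS = 0
KSFT_FAIL = 1
KSFT_SKIP = 4


def lam_test_exit_code(cpu_has_lam, run_cases):
    """Return the exit code a LAM test would use: skip when the feature is absent."""
    if not cpu_has_lam:
        print("# Unsupported LAM feature!")
        return KSFT_SKIP
    return KSFT_PASS if run_cases() else KSFT_FAIL


def runner_line(num, name, exit_code):
    """Format a result line in the style seen in this report's log (assumed format)."""
    if exit_code == KSFT_PASS:
        return f"ok {num} {name}"
    if exit_code == KSFT_SKIP:
        return f"ok {num} {name} # SKIP"
    return f"not ok {num} {name} # exit={exit_code}"


code = lam_test_exit_code(cpu_has_lam=False, run_cases=lambda: True)
print(runner_line(19, "selftests: x86: lam_64", code))
```

The point of the design is that an unsupported CPU is reported as a skip rather than a failure, so the test no longer counts against the run on hardware without LAM.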
Comment 3 zelin alibaba_cloud_group 2024-04-30 14:40:44 UTC
We will pick up the fix for this test case.
Comment 4 小龙 admin 2024-05-06 16:45:52 UTC
The PR Link: https://gitee.com/anolis/cloud-kernel/pulls/3125
Comment 5 zelin alibaba_cloud_group 2024-05-07 13:42:52 UTC
See the PR: if the CPU does not support LAM or XSAVE-related features, the relevant test cases are skipped instead of failing.
Comment 6 anolislw alibaba_cloud_group 2024-05-11 14:26:15 UTC
Regression-tested with the rc2 kernel in an x86 ECS environment; the issue is resolved. The test case now takes the skip path. Part of the log follows:
=========================================
# [OK]  Load tiledata succeeded.
# [RUN] Check tile data inheritance.
#       Before fork(), load tiledata
# [RUN] Check tiledata context switches, 10 iterations, 5 threads.
# [OK]  No incorrect case was found.
#       Read the init'ed tiledata via ptrace().
# [OK]  The init'ed tiledata was read from ptracee.
#       Inject tiledata via ptrace().
# [OK]  Tiledata was correctly written to ptracee.
ok 18 selftests: x86: amx_64
# timeout set to 45
# selftests: x86: lam_64
# # Unsupported LAM feature!
ok 19 selftests: x86: lam_64 # SKIP
# timeout set to 45
# selftests: x86: test_shadow_stack_64
# [SKIP]        Could not enable Shadow stack
not ok 20 selftests: x86: test_shadow_stack_64 # exit=1
make[1]: Leaving directory '/root/rpmbuild/BUILD/kernel-6.6.25-2_rc2.an23/linux-6.6.25-2_rc2.an23.x86_64/tools/testing/selftests/x86'
make: Leaving directory '/root/rpmbuild/BUILD/kernel-6.6.25-2_rc2.an23/linux-6.6.25-2_rc2.an23.x86_64/tools/testing/selftests'
[root@iZbp1c9jzchxjqive233ugZ linux-6.6.25-2_rc2.an23.x86_64]# echo $?
0
[root@iZbp1c9jzchxjqive233ugZ linux-6.6.25-2_rc2.an23.x86_64]# make -C tools/testing/selftests/ TARGETS=x86 run_tests^C

[root@iZbp1c9jzchxjqive233ugZ linux-6.6.25-2_rc2.an23.x86_64]# cat a | grep lam_64
# selftests: x86: lam_64
ok 19 selftests: x86: lam_64 # SKIP

[root@iZbp1c9jzchxjqive233ugZ linux-6.6.25-2_rc2.an23.x86_64]# pwd
/root/rpmbuild/BUILD/kernel-6.6.25-2_rc2.an23/linux-6.6.25-2_rc2.an23.x86_64
[root@iZbp1c9jzchxjqive233ugZ linux-6.6.25-2_rc2.an23.x86_64]# uname -r
6.6.25-2_rc2.an23.x86_64
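A verification like the `grep lam_64` above can also be done by classifying every result line in the saved log. The sketch below is a hypothetical helper, not something used in this bug; it assumes result lines have the form `ok N name`, `ok N name # SKIP`, or `not ok N name # ...`, as in the excerpt above.

```python
import re

# Matches kselftest result lines; lines starting with '#' are diagnostics and are ignored.
RESULT_RE = re.compile(r"^(not ok|ok) (\d+) (.*?)(?: # (.*))?$")


def parse_results(log_text):
    """Return (number, name, outcome) tuples, with outcome in pass/skip/fail."""
    results = []
    for line in log_text.splitlines():
        m = RESULT_RE.match(line)
        if not m:
            continue
        status, num, name, directive = m.groups()
        if status == "not ok":
            outcome = "fail"
        elif directive and directive.startswith("SKIP"):
            outcome = "skip"
        else:
            outcome = "pass"
        results.append((int(num), name, outcome))
    return results


# Lines copied from the log in this comment.
log = """ok 18 selftests: x86: amx_64
# selftests: x86: lam_64
# # Unsupported LAM feature!
ok 19 selftests: x86: lam_64 # SKIP
# [SKIP]        Could not enable Shadow stack
not ok 20 selftests: x86: test_shadow_stack_64 # exit=1
"""
for num, name, outcome in parse_results(log):
    print(num, name, outcome)
```

Applied to this excerpt, lam_64 is classified as a skip, while test_shadow_stack_64 is still reported as not ok.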
Comment 7 yunmeng365524 2024-05-20 15:12:45 UTC
Issue resolved; closing.