Bug 8861 - [Anolis23.1 GA][Beta][ANCK-6.6.25-2][aarch64/x86_64] bcc: test_histogram.py test case fails with error: Failed to attach BPF program b'kprobe__finish_task_switch' to kprobe b'finish_task_switch'
Summary: [Anolis23.1 GA][Beta][ANCK-6.6.25-2][aarch64/x86_64] bcc: test_histogram.py test case fails...
Status: CLOSED WONTFIX
Alias: None
Product: Anolis OS 23
Classification: Anolis OS
Component: ToBeTriaged
Version: 23.1
Hardware: All Linux
Importance: P3-Medium S3-normal
Target Milestone: beta
Assignee: gaochang
QA Contact:
URL:
Whiteboard:
Keywords: Function
Depends on:
Blocks:
 
Reported: 2024-04-23 17:40 UTC by zhixin01
Modified: 2025-02-25 18:01 UTC
CC List: 8 users

See Also:


Description zhixin01 alibaba_cloud_group 2024-04-23 17:40:06 UTC
[Defect description]:
bcc: the test_histogram.py test case fails with the error: Failed to attach BPF program b'kprobe__finish_task_switch' to kprobe b'finish_task_switch'

Software versions:
# rpm -qa |grep bcc
bcc-tools-0.27.0-1.an23.aarch64
bcc-0.27.0-1.an23.aarch64
python3-bcc-0.27.0-1.an23.noarch
bcc-devel-0.27.0-1.an23.aarch64

Failure log:
# python test_histogram.py
cannot attach kprobe, probe entry may not exist
E
k_1 & k_2 =   0 0
     size                : count     distribution
       512 -> 1023       : 2        |*************                           |
      1024 -> 2047       : 0        |                                        |
      2048 -> 4095       : 0        |                                        |
      4096 -> 8191       : 6        |****************************************|

k_1 & k_2 =  16 0
     size                : count     distribution
         8 -> 15         : 1        |********************                    |
        16 -> 31         : 2        |****************************************|

k_1 & k_2 =  32 0
     size                : count     distribution
       256 -> 511        : 1        |****************************************|

k_1 & k_2 =  48 0
     size                : count     distribution
       256 -> 511        : 2        |***                                     |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 0        |                                        |
      2048 -> 4095       : 0        |                                        |
      4096 -> 8191       : 2        |***                                     |
      8192 -> 16383      : 0        |                                        |
     16384 -> 32767      : 21       |****************************************|

k_1 & k_2 =  48 1
     size                : count     distribution
       128 -> 255        : 1        |****************************************|

k_1 & k_2 =  64 0
     size                : count     distribution
      4096 -> 8191       : 2        |****************************************|

k_1 & k_2 = 112 0
     size                : count     distribution
       512 -> 1023       : 4        |********************************        |
      1024 -> 2047       : 0        |                                        |
      2048 -> 4095       : 5        |****************************************|
...
======================================================================
ERROR: test_chars (__main__.TestHistogram)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/bcc/bcc/bcc-0.27.0/tests/python/test_histogram.py", line 61, in test_chars
    b = BPF(text=b"""
  File "/usr/lib/python3.10/site-packages/bcc/__init__.py", line 487, in __init__
    self._trace_autoload()
  File "/usr/lib/python3.10/site-packages/bcc/__init__.py", line 1456, in _trace_autoload
    self.attach_kprobe(
  File "/usr/lib/python3.10/site-packages/bcc/__init__.py", line 845, in attach_kprobe
    raise Exception("Failed to attach BPF program %s to kprobe %s"
Exception: Failed to attach BPF program b'kprobe__finish_task_switch' to kprobe b'finish_task_switch', it's not traceable (either non-existing, inlined, or marked as "notrace")

----------------------------------------------------------------------
Ran 4 tests in 3.021s

FAILED (errors=1)

[Reproduction rate]:
Always reproducible

[Reproduction environment]:
Kernel:
# uname -r
6.6.25-2_rc1.an23.x86_64

# cat /etc/os-release
NAME="Anolis OS"
VERSION="23"
ID="anolis"
VERSION_ID="23"
PLATFORM_ID="platform:an23"
PRETTY_NAME="Anolis OS 23"
ANSI_COLOR="0;31"
HOME_URL="https://openanolis.cn/"
BUG_REPORT_URL="https://bugzilla.openanolis.cn/"

CPU information:
# lscpu
Architecture:             x86_64
  CPU op-mode(s):         32-bit, 64-bit
  Address sizes:          52 bits physical, 57 bits virtual
  Byte Order:             Little Endian
CPU(s):                   4
  On-line CPU(s) list:    0-3
Vendor ID:                GenuineIntel
  BIOS Vendor ID:         Alibaba Cloud
  Model name:             Intel(R) Xeon(R) Platinum 8475B
    BIOS Model name:      pc-q35-df-2.1  CPU @ 0.0GHz
    BIOS CPU family:      1
    CPU family:           6
    Model:                143
    Thread(s) per core:   2
    Core(s) per socket:   2
    Socket(s):            1
    Stepping:             8
    CPU(s) scaling MHz:   84%
    CPU max MHz:          3800.0000
    CPU min MHz:          800.0000
    BogoMIPS:             5400.00
    Flags:                fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtsc
                          p lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq monitor ssse3 fma cx16 pdcm
                          pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault ibrs_enhanced
                          fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512c
                          d sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx_vnni avx512_bf16 wbnoinvd ida arat hwp hwp_notify hwp_act_window h
                          wp_epp hwp_pkg_req avx512vbmi umip pku ospke waitpkg avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntd
                          q rdpid bus_lock_detect cldemote movdiri movdir64b enqcmd fsrm md_clear serialize tsxldtrk amx_bf16 avx512_fp16 amx_tile amx_int
                          8 arch_capabilities
Virtualization features:
  Hypervisor vendor:      KVM
  Virtualization type:    full
Caches (sum of all):
  L1d:                    96 KiB (2 instances)
  L1i:                    64 KiB (2 instances)
  L2:                     4 MiB (2 instances)
  L3:                     97.5 MiB (1 instance)
NUMA:
  NUMA node(s):           1
  NUMA node0 CPU(s):      0-3
Vulnerabilities:
  Gather data sampling:   Not affected
  Itlb multihit:          Not affected
  L1tf:                   Not affected
  Mds:                    Not affected
  Meltdown:               Not affected
  Mmio stale data:        Unknown: No mitigations
  Reg file data sampling: Not affected
  Retbleed:               Not affected
  Spec rstack overflow:   Not affected
  Spec store bypass:      Vulnerable
  Spectre v1:             Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:             Mitigation; Enhanced / Automatic IBRS, RSB filling, PBRSB-eIBRS SW sequence
  Srbds:                  Not affected
  Tsx async abort:        Not affected

Memory information:
# free -h
               total        used        free      shared  buff/cache   available
Mem:            15Gi       585Mi       6.9Gi       364Mi       7.6Gi        13Gi
Swap:             0B          0B          0B


[Steps to reproduce]:
yum install bcc-tools yum-utils rpm-build python3-pyroute2.noarch
git clone https://gitee.com/src-anolis-os/bcc.git --branch a23
cd bcc
yum-builddep -y bcc.spec
rpmbuild -D "_topdir $(pwd)" \
        -D "_sourcedir $(pwd)" \
        -D "_builddir $(pwd)" \
        -bp bcc.spec
cd bcc-0.27.0/tests/python
python test_histogram.py

[Expected result]:
The test case passes

[Actual result]:
The test case fails
Comment 1 feitian200603 alibaba_cloud_group 2024-04-28 20:10:17 UTC
cat /sys/kernel/debug/tracing/available_filter_functions | grep finish_task_switch
finish_task_switch.isra.0

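The set of traceable functions has changed. Please confirm whether this change is required by the release plan. If the function has changed, the corresponding kernel selftest cases need to be adapted.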
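The check above can be scripted. The following is a minimal sketch, not part of bcc: the helper name traceable_variants and the sample listing are hypothetical. It scans text in the format of available_filter_functions and returns the base symbol plus any compiler-generated clones such as .isra.0.

```python
import re


def traceable_variants(symbol, listing):
    """Return entries of `listing` that are `symbol` itself or a dotted clone of it.

    `listing` is text in the format of
    /sys/kernel/debug/tracing/available_filter_functions:
    one function per line, optionally followed by " [module]".
    """
    pattern = re.compile(r"^" + re.escape(symbol) + r"(\.[A-Za-z0-9_.]+)?$")
    found = []
    for line in listing.splitlines():
        fields = line.split()
        # The first field is the function name; a module tag may follow it.
        if fields and pattern.match(fields[0]):
            found.append(fields[0])
    return found


# Sample listing: the first entry is taken from the comment above,
# the other two are made up for illustration.
SAMPLE = "finish_task_switch.isra.0\nfinish_task_switch_hook [demo]\nschedule\n"
print(traceable_variants("finish_task_switch", SAMPLE))
# prints ['finish_task_switch.isra.0']
```

On a real system, the contents of /sys/kernel/debug/tracing/available_filter_functions (readable as root) would be passed in place of SAMPLE.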
Comment 2 xiangzao alibaba_cloud_group 2024-05-07 19:17:25 UTC
With GCC 10 and later, finish_task_switch is optimized into finish_task_switch.isra.0, where isra stands for inter-procedural scalar replacement of aggregates.
Users who want to trace finish_task_switch have several options:

1. Prevent GCC from optimizing finish_task_switch, either by removing the static qualifier in front of finish_task_switch or by changing it to asmlinkage __visible.

2. Trace finish_task_switch.isra.0 directly. Another bcc example already does this, for instance ./bcc/examples/tracing/task_switch.py:
b = BPF(src_file="task_switch.c")
b.attach_kprobe(event_re=r"^finish_task_switch$|^finish_task_switch\.isra\.\d$",
                fn_name="count_sched")
In testing, this traces successfully.
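To make the behaviour of that pattern concrete, here is a minimal sketch. The names EVENT_RE, matches and attach_count_sched are hypothetical, and bpf_text is assumed to be a BPF C program that defines a handler named count_sched, as task_switch.c does. Only BPF(text=...) and attach_kprobe(event_re=..., fn_name=...) are bcc calls. The regex helper runs without bcc installed.

```python
import re

# Pattern quoted above from examples/tracing/task_switch.py.
EVENT_RE = r"^finish_task_switch$|^finish_task_switch\.isra\.\d$"


def matches(name):
    """True if `name` is the plain symbol or a single-digit .isra.N clone."""
    return re.match(EVENT_RE, name) is not None


def attach_count_sched(bpf_text):
    """Attach the count_sched handler to whichever variant the kernel exports.

    Assumption: `bpf_text` defines `int count_sched(...)`. Requires root
    and bcc, so bcc is imported only when this function is called.
    """
    from bcc import BPF

    b = BPF(text=bpf_text)
    b.attach_kprobe(event_re=EVENT_RE, fn_name="count_sched")
    return b


for name in ("finish_task_switch", "finish_task_switch.isra.0"):
    print(name, matches(name))
```

Note that \d matches exactly one digit, so a clone numbered 10 or higher would need \d+ instead.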

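In this case, test_histogram.py uses int kprobe__finish_task_switch, and that handler cannot be renamed to target finish_task_switch.isra.0, because a C function name cannot contain a '.' character.
The case in this Bugzilla entry is only an example. Users can implement the functionality themselves by tracing finish_task_switch.isra.0. This is not a kernel issue, and upstream bcc has not changed this test, so the bug is set to WONTFIX.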
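The constraint can be illustrated with a small sketch. It models, in simplified form, the kprobe__ naming convention visible in the traceback, where the handler kprobe__finish_task_switch is attached to the kprobe finish_task_switch. The helper names are hypothetical, and this is not bcc's actual implementation.

```python
import re

KPROBE_PREFIX = "kprobe__"
C_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def autoload_target(func_name):
    """Kernel symbol implied by a kprobe__-prefixed handler name, else None."""
    if func_name.startswith(KPROBE_PREFIX):
        return func_name[len(KPROBE_PREFIX):]
    return None


def handler_name_for(symbol):
    """Handler name that would auto-attach to `symbol`, or None if that
    name is not a valid C identifier."""
    name = KPROBE_PREFIX + symbol
    return name if C_IDENTIFIER.match(name) else None


print(autoload_target("kprobe__finish_task_switch"))
# prints finish_task_switch
print(handler_name_for("finish_task_switch.isra.0"))
# prints None
```

Because no valid handler name maps to the .isra.0 symbol, a program that needs it has to give its handler an ordinary name and attach it explicitly.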
Comment 3 zhixin01 alibaba_cloud_group 2024-05-22 10:47:34 UTC
As the developer's analysis above states, this is not a kernel issue. Closing this bug.
Comment 4 zhixin01 alibaba_cloud_group 2025-02-25 18:01:48 UTC
The same problem also occurs on the 6.6.71-3_rc1.al8.aarch64 kernel. Recording it here.