[Defect description]: bcc: test_clang.py fails with "Failed to compile BPF module <text>"

Software version:
# rpm -qa |grep bcc
bcc-tools-0.27.0-1.an23.aarch64
bcc-0.27.0-1.an23.aarch64
python3-bcc-0.27.0-1.an23.noarch
bcc-devel-0.27.0-1.an23.aarch64

Partial failure log:
# python test_clang.py
....../virtual/main.c:3:33: error: incomplete definition of type 'struct request'
    bpf_trace_printk("%s\n", req->rq_disk->disk_name);
                             ~~~^
include/linux/blkdev.h:32:8: note: forward declaration of 'struct request'
struct request;
       ^
1 error generated.
E........../virtual/main.c:2:1: error: field has incomplete type 'struct key_t'
BPF_HASH(drops, struct key_t);
^
/virtual/include/bcc/helpers.h:278:48: note: expanded from macro 'BPF_HASH'
  BPF_HASHX(__VA_ARGS__, BPF_HASH4, BPF_HASH3, BPF_HASH2, BPF_HASH1)(__VA_ARGS__)
                                               ^
/virtual/main.c:2:24: note: forward declaration of 'struct key_t'
BPF_HASH(drops, struct key_t);
                       ^
/virtual/main.c:2:1: error: field has incomplete type 'struct key_t'
BPF_HASH(drops, struct key_t);
^
/virtual/include/bcc/helpers.h:278:48: note: expanded from macro 'BPF_HASH'
  BPF_HASHX(__VA_ARGS__, BPF_HASH4, BPF_HASH3, BPF_HASH2, BPF_HASH1)(__VA_ARGS__)
                                               ^
/virtual/main.c:2:24: note: forward declaration of 'struct key_t'
BPF_HASH(drops, struct key_t);
                       ^
2 errors generated.
../virtual/main.c:6:12: error: cannot call non-static helper function
    return bar();
           ^
1 error generated.
./virtual/main.c:13:46: error: incomplete definition of type 'struct request'
    bpf_trace_printk("traced start %d\n", req->__data_len);
                                          ~~~^
include/linux/blkdev.h:32:8: note: forward declaration of 'struct request'
struct request;
       ^
1 error generated.
E/virtual/main.c:12:12: error: incomplete definition of type 'struct request'
    if (!rq->start_time_ns)
         ~~^
include/linux/blkdev.h:32:8: note: forward declaration of 'struct request'
struct request;
       ^
/virtual/main.c:15:12: error: incomplete definition of type 'struct request'
    if (!rq->rq_disk || rq->rq_disk->major != 5 ||
         ~~^
include/linux/blkdev.h:32:8: note: forward declaration of 'struct request'
struct request;
       ^
/virtual/main.c:15:27: error: incomplete definition of type 'struct request'
    if (!rq->rq_disk || rq->rq_disk->major != 5 ||
                        ~~^
include/linux/blkdev.h:32:8: note: forward declaration of 'struct request'
struct request;
       ^
================ partially omitted ================
1 error generated.
.cannot attach kprobe, probe entry may not exist
Current kernel does not have __vfs_read, try vfs_read instead
./virtual/main.c:4:14: error: incomplete definition of type 'struct request'
    if (!(req->bio->bi_flags & 1))
          ~~~^
include/linux/blkdev.h:32:8: note: forward declaration of 'struct request'
struct request;
       ^
/virtual/main.c:6:14: error: incomplete definition of type 'struct request'
    if (((req->bio->bi_flags)))
          ~~~^
include/linux/blkdev.h:32:8: note: forward declaration of 'struct request'
struct request;
       ^
2 errors generated.
E.
======================================================================
ERROR: test_char_array_probe (__main__.TestClang)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/bcc/bcc/bcc-0.27.0/tests/python/test_clang.py", line 319, in test_char_array_probe
    BPF(text=b"""#include <linux/blkdev.h>
  File "/usr/lib/python3.10/site-packages/bcc/__init__.py", line 479, in __init__
    raise Exception("Failed to compile BPF module %s" % (src_file or "<text>"))
Exception: Failed to compile BPF module <text>

======================================================================
ERROR: test_iosnoop (__main__.TestClang)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/bcc/bcc/bcc-0.27.0/tests/python/test_clang.py", line 247, in test_iosnoop
    b = BPF(text=text, debug=0)
  File "/usr/lib/python3.10/site-packages/bcc/__init__.py", line 479, in __init__
    raise Exception("Failed to compile BPF module %s" % (src_file or "<text>"))
Exception: Failed to compile BPF module <text>

[Reproduction probability]: Always reproducible

[Reproduction environment]:
Kernel:
# uname -r
6.6.25-2_rc1.an23.x86_64
# cat /etc/os-release
NAME="Anolis OS"
VERSION="23"
ID="anolis"
VERSION_ID="23"
PLATFORM_ID="platform:an23"
PRETTY_NAME="Anolis OS 23"
ANSI_COLOR="0;31"
HOME_URL="https://openanolis.cn/"
BUG_REPORT_URL="https://bugzilla.openanolis.cn/"

CPU info:
# lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         52 bits physical, 57 bits virtual
  Byte Order:            Little Endian
CPU(s):                  4
  On-line CPU(s) list:   0-3
Vendor ID:               GenuineIntel
  BIOS Vendor ID:        Alibaba Cloud
  Model name:            Intel(R) Xeon(R) Platinum 8475B
    BIOS Model name:     pc-q35-df-2.1 CPU @ 0.0GHz
    BIOS CPU family:     1
    CPU family:          6
    Model:               143
    Thread(s) per core:  2
    Core(s) per socket:  2
    Socket(s):           1
    Stepping:            8
    CPU(s) scaling MHz:  84%
    CPU max MHz:         3800.0000
    CPU min MHz:         800.0000
    BogoMIPS:            5400.00
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq monitor ssse3 fma cx16 pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault ibrs_enhanced fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx_vnni avx512_bf16 wbnoinvd ida arat hwp hwp_notify hwp_act_window hwp_epp hwp_pkg_req avx512vbmi umip pku ospke waitpkg avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid bus_lock_detect cldemote movdiri movdir64b enqcmd fsrm md_clear serialize tsxldtrk amx_bf16 avx512_fp16 amx_tile amx_int8 arch_capabilities
Virtualization features:
  Hypervisor vendor:     KVM
  Virtualization type:   full
Caches (sum of all):
  L1d:                   96 KiB (2 instances)
  L1i:                   64 KiB (2 instances)
  L2:                    4 MiB (2 instances)
  L3:                    97.5 MiB (1 instance)
NUMA:
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-3
Vulnerabilities:
  Gather data sampling:  Not affected
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Unknown: No mitigations
  Reg file data sampling: Not affected
  Retbleed:              Not affected
  Spec rstack overflow:  Not affected
  Spec store bypass:     Vulnerable
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Enhanced / Automatic IBRS, RSB filling, PBRSB-eIBRS SW sequence
  Srbds:                 Not affected
  Tsx async abort:       Not affected

Memory info:
# free -h
               total        used        free      shared  buff/cache   available
Mem:            15Gi       585Mi       6.9Gi       364Mi       7.6Gi        13Gi
Swap:             0B          0B          0B

[Reproduction steps]:
yum install bcc-tools yum-utils rpm-build python3-pyroute2.noarch
git clone https://gitee.com/src-anolis-os/bcc.git --branch a23
cd bcc
yum-builddep -y bcc.spec
rpmbuild -D "_topdir $(pwd)" \
         -D "_sourcedir $(pwd)" \
         -D "_builddir $(pwd)" \
         -bp bcc.spec
cd bcc-0.27.0/tests/python
python test_clang.py

[Expected result]: The test cases pass.

[Actual result]: The test cases fail.
## Error analysis:
# 9 errors occur while the BPF program is being loaded: "Failed to compile BPF module"
test_char_array_probe
test_iosnoop
test_jump_table
test_probe_read3
test_probe_read4
test_probe_read_array_accesses8
test_probe_read_nested_member3
test_probe_read_whitelist1
test_probe_read_whitelist2
# test_unop_probe_read

One error is related to attaching the BPF program:
Failed to attach BPF program b'kprobe__finish_task_switch' to kprobe b'finish_task_switch', it's not traceable (either non-existing, inlined, or marked as "notrace"
test_task_switch

# One error is an AssertionError
test_sscanf_string

## Investigation
Not every BPF initialization fails, and a simple BPF program runs correctly, so BPF loading itself works and no required module appears to be missing. The cause is probably the text passed to BPF(), so each case has to be investigated individually.

Simple usage example:
from bcc import BPF
BPF(text='int kprobe____x64_sys_clone(void *ctx) { bpf_trace_printk("Hello, World!\\n"); return 0; }').trace_print()
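Since each case has to be examined on its own, it can help to first pull the failing test names out of the unittest output. The following is a minimal sketch, not part of the bcc test suite: the function name `failed_tests` and the regular expression are assumptions made for illustration, and the sample text is an excerpt of the traceback headers quoted in the report.

```python
import re

# Matches unittest result headers such as
# "ERROR: test_iosnoop (__main__.TestClang)".
RESULT_HEADER = re.compile(r"^(ERROR|FAIL): (\w+) \(([\w.]+)\)", re.MULTILINE)


def failed_tests(output):
    """Return (kind, test_name) pairs in the order they appear in the output."""
    return [(m.group(1), m.group(2)) for m in RESULT_HEADER.finditer(output)]


# Excerpt of the output quoted in this report.
SAMPLE = """\
======================================================================
ERROR: test_char_array_probe (__main__.TestClang)
----------------------------------------------------------------------
Exception: Failed to compile BPF module <text>
======================================================================
ERROR: test_iosnoop (__main__.TestClang)
----------------------------------------------------------------------
Exception: Failed to compile BPF module <text>
"""

if __name__ == "__main__":
    for kind, name in failed_tests(SAMPLE):
        print(kind, name)
```

A single case can then be rerun with the standard unittest selector, for example `python test_clang.py TestClang.test_char_array_probe`.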
1. test_char_array_probe fails because the definition of struct request moved from linux/blkdev.h to linux/blk-mq.h, and struct request no longer has an rq_disk member. To print disk_name, the access has to be changed to req->q->disk->disk_name.
2. test_iosnoop: same as above; it needs #include <linux/blk-mq.h>.
3. test_jump_table: same as above; it needs #include <linux/blk-mq.h>.
4. test_probe_read3, test_probe_read4: no problem was seen when running them.
5. test_probe_read_array_accesses8 fails because rss_stat in struct mm_struct has changed from struct mm_rss_stat to struct percpu_counter rss_stat[NR_MM_COUNTERS].
6. test_probe_read_nested_member3 fails because the return type does not match; the following commit is needed: https://github.com/iovisor/bcc/commit/9533c2580f907ce2b185014b31524e9f36fc998e
7. test_task_switch fails because GCC 10 and later optimize finish_task_switch into finish_task_switch.isra.0; see https://bugzilla.openanolis.cn/show_bug.cgi?id=8861
8. test_unop_probe_read: same as items 1, 2 and 3.

In short, the failures come mostly from test cases that do not match the current kernel.
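Items 1 and 7 above can be sketched as follows. This is a hedged illustration rather than the actual fix merged for the test suite: the probe body is modeled on the failing test, the names `CHAR_ARRAY_PROBE_TEXT`, `FINISH_TASK_SWITCH_RE`, `matches_switch_symbol` and `try_compile` are hypothetical, and the pattern is one assumed way to cover both the plain and the `.isra.N` symbol name. Compilation is only attempted when the `bcc` Python module is importable.

```python
import re

# Item 1: include <linux/blk-mq.h> for the full 'struct request' definition
# and reach the disk name through req->q->disk instead of req->rq_disk.
CHAR_ARRAY_PROBE_TEXT = r"""
#include <linux/blk-mq.h>
#include <linux/blkdev.h>
int kprobe__blk_update_request(struct pt_regs *ctx, struct request *req) {
    bpf_trace_printk("%s\n", req->q->disk->disk_name);
    return 0;
}
"""

# Item 7: accept both "finish_task_switch" and the compiler-generated
# "finish_task_switch.isra.N" variant (N is a single digit here).
FINISH_TASK_SWITCH_RE = r"^finish_task_switch$|^finish_task_switch\.isra\.\d$"


def matches_switch_symbol(name):
    """Return True if a kernel symbol name is one of the accepted variants."""
    return re.match(FINISH_TASK_SWITCH_RE, name) is not None


def try_compile(text):
    """Return True/False for compile success, or None if bcc is unavailable."""
    try:
        from bcc import BPF
    except ImportError:
        return None
    try:
        BPF(text=text.encode())
        return True
    except Exception:
        return False


if __name__ == "__main__":
    print(matches_switch_symbol("finish_task_switch.isra.0"))
    print(try_compile(CHAR_ARRAY_PROBE_TEXT))
```

With bcc installed, the same pattern could be passed as the `event_re` argument of `BPF.attach_kprobe` instead of relying on the `kprobe__finish_task_switch` auto-attach name, which only matches the exact symbol.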
The PR Link: https://gitee.com/anolis/cloud-kernel/pulls/3182