Bug 3634 - commit 'dc970b94de760 anolis: kvm/pmu: support kvm compiled for module' causes a large UnixBench performance drop and changes the original code execution path
Status: NEW
Alias: None
Product: ANCK 4.19 Dev
Classification: ANCK
Component: virt
Version: unspecified
Hardware: All Linux
Importance: P3-Medium S3-normal
Target Milestone: ---
Assignee: fghui_kernel
QA Contact: shuming
Reported: 2023-01-05 18:04 UTC by 苟浩
Modified: 2023-01-05 18:35 UTC
Description 苟浩 uniontech_group 2023-01-05 18:04:55 UTC
The commit 'dc970b94de760 anolis: kvm/pmu: support kvm compiled for module' has two problems:
1. It causes a large UnixBench performance drop;
2. It changes the original code execution path.

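1. Performance drop
After merging this patch into our kernel, I ran UnixBench on Kunpeng arm64 and found that the score of the context1 test case dropped sharply.

Command: ./Run -c 64 context1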

    With this patch:

    System Benchmarks Partial Index              BASELINE       RESULT    INDEX
    Pipe-based Context Switching                   4000.0    3535014.2   8837.5
                                                                       ========
    System Benchmarks Index Score (Partial Only)                         8837.5

    -------------------

    Without this patch:

    System Benchmarks Partial Index              BASELINE       RESULT    INDEX
    Pipe-based Context Switching                   4000.0   10431845.8  26079.6
                                                                       ========
    System Benchmarks Index Score (Partial Only)                        26079.6
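
To put a number on the regression, the two results above can be compared directly. The sketch below assumes the usual UnixBench convention that INDEX = RESULT / BASELINE * 10, which matches the figures in both tables; the helper names unixbench_index and percent_drop are hypothetical and only serve this illustration.

```c
#include <assert.h>

/* Assumed UnixBench convention: index = result / baseline * 10. */
static double unixbench_index(double baseline, double result)
{
    return result / baseline * 10.0;
}

/* Relative drop, in percent, from `before` to `after`. */
static double percent_drop(double before, double after)
{
    return (before - after) / before * 100.0;
}

/* Figures copied from the two context1 runs in this report. */
static const double BASELINE      = 4000.0;
static const double RESULT_WITH   = 3535014.2;  /* with the patch    */
static const double RESULT_WITHOUT = 10431845.8; /* without the patch */
```

With these inputs, the index with the patch comes out at roughly a third of the index without it, i.e. a drop of about 66%.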

2. Changed code execution path
perf calls armv8pmu_reset before init_subsystems is called. The call stack is as follows:

[    1.043061] Call trace:
[    1.043065]  dump_backtrace+0x0/0x198
[    1.043066]  show_stack+0x1c/0x28
[    1.043069]  dump_stack+0xa8/0xf0
[    1.043071]  armv8pmu_reset+0x6c/0xa8
[    1.043073]  arm_perf_starting_cpu+0x58/0xf0
[    1.043075]  cpuhp_invoke_callback+0x104/0x658
[    1.043076]  cpuhp_thread_fun+0x94/0x150
[    1.043079]  smpboot_thread_fn+0x158/0x1d8
[    1.043081]  kthread+0x130/0x138
[    1.043622] gh: 3: 0
[    1.043648] hw perfevents: enabled with armv8_pmuv3_0 PMU driver, 13 counters available
[    1.043665] kvm [1]: Hisi ncsnp: enabled
[    1.043770] kvm [1]: 16-bit VMID
[    1.043771] kvm [1]: IPA Size Limit: 48bits
[    1.043794] kvm [1]: GICv4 support disabled
[    1.043795] kvm [1]: vgic-v2@9b020000
[    1.043809] kvm [1]: GIC system register CPU interface enabled
[    1.044175] kvm [1]: vgic interrupt IRQ1
[    1.044552] kvm [1]: VHE mode initialized successfully

Before 'dc970b94de760' was merged, the code called armv8pmu_reset()->kvm_clr_pmu_events(U32_MAX);

However, 'dc970b94de760' changed armv8pmu_reset to:
-	kvm_clr_pmu_events(U32_MAX);
+	if (kvm_clr_pmu_events_ptr)
+		(*kvm_clr_pmu_events_ptr)(U32_MAX);

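In the arm_perf_starting_cpu call stack above, register_kvm_pmu_events_handler has not been called yet, so kvm_clr_pmu_events_ptr is NULL (the line '[    1.043622] gh: 3: 0' in the dmesg above is my own debug print, showing that kvm_clr_pmu_events_ptr was 0 at that point). As a result, kvm_clr_pmu_events is not called.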
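
The ordering problem can be reproduced in a small user-space sketch. This is not the kernel code: every fake_* name, the fake_pmu_events variable, and the one-argument registration signature are assumptions made for illustration (the real signature of register_kvm_pmu_events_handler is not shown in this report). Only the NULL-guarded call mirrors the hunk quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the state that kvm_clr_pmu_events() clears. */
static uint32_t fake_pmu_events = 0xdeadbeefu;
static int clear_calls;

/* Stand-in for kvm_clr_pmu_events(): clears the bits given in `clr`. */
static void fake_kvm_clr_pmu_events(uint32_t clr)
{
    fake_pmu_events &= ~clr;
    clear_calls++;
}

/* Hook pointer; stays NULL until the KVM side registers its handler. */
static void (*kvm_clr_pmu_events_ptr)(uint32_t clr);

/* Stand-in for register_kvm_pmu_events_handler(); signature is assumed. */
static void fake_register_handler(void (*clr_fn)(uint32_t))
{
    kvm_clr_pmu_events_ptr = clr_fn;
}

/* Mirrors the patched armv8pmu_reset() hunk: call only if registered. */
static void fake_armv8pmu_reset(void)
{
    if (kvm_clr_pmu_events_ptr)
        (*kvm_clr_pmu_events_ptr)(UINT32_MAX);
}

/*
 * Replays the boot order from the dmesg above: the CPU hotplug callback
 * resets the PMU first, and the handler is registered only afterwards.
 * Returns how many clears happened during that early reset.
 */
static int replay_boot_order(void)
{
    fake_armv8pmu_reset();              /* pointer is still NULL here */
    int calls_during_reset = clear_calls;
    fake_register_handler(fake_kvm_clr_pmu_events);
    return calls_during_reset;
}
```

In this sketch the early reset performs no clear, so fake_pmu_events keeps its stale value until some later reset runs after registration. Whether the equivalent skipped clear in the kernel explains the context1 regression is not established here.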

I do not know whether this is what causes the performance drop in problem 1. Could the community please help analyze it? Thanks!