Description of problem:
The virtual machine hangs after a CPU hotplug.

Version-Release number of selected component (if applicable):
Kernel version: 4.19.190-7.7.an8.loongarch64
QEMU version: QEMU emulator version 6.2.0 (qemu-kvm-6.2.0-41.0.1.module+an8.9.0+11168+98c7cfc6.1)

Steps to Reproduce:
1. Start the virtual machine with the following command:
MALLOC_PERTURB_=1 /usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1' \
    -sandbox on \
    -machine loongson7a,memory-backend=mem-machine_mem \
    -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
    -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0 \
    -nodefaults \
    -device VGA,bus=pcie.0,addr=0x2 \
    -m 2048 \
    -object memory-backend-ram,size=2048M,id=mem-machine_mem \
    -smp 1,maxcpus=2,cores=2,threads=1,sockets=1 \
    -cpu 'Loongson-3A5000' \
    -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
    -device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0 \
    -blockdev node-name=file_image1,driver=file,filename=/root/avocado/data/avocado-vt/images/AnolisOS-8.9-loongarch64.qcow2 \
    -blockdev node-name=drive_image1,driver=qcow2,file=file_image1 \
    -device scsi-hd,id=image1,drive=drive_image1 \
    -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
    -vnc :0 \
    -rtc base=utc,clock=host \
    -boot menu=off,order=cdn,once=c,strict=off \
    -bios loongarch_bios.bin \
    -enable-kvm \
    -device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=5 \
    -serial stdio \
    -monitor telnet:localhost:4444,server,nowait
2. After the VM has booted, hotplug a CPU from the QEMU monitor with the following commands:
# telnet localhost 4444
Trying ::1...
Connected to localhost.
Escape character is '^]'.
QEMU 6.2.0 monitor - type 'help' for more information
(qemu)
(qemu) device_add Loongson-3A5000-loongarch-cpu,id=vcpu1,core-id=1

Actual results:
[root@anolis-8-guest ~]# [ 41.221035] CPU1 has been hot-added
[ 41.224239] Booting CPU#1...
[ 8697.930254] 64-bit Loongson Processor probed (LA464 Core)
[ 8697.930937] CPU1 revision is: 0014c010 (Loongson-64bit)
[ 8697.931505] FPU1 revision is: 00000000
[ 8697.931912] CPU1 __my_cpu_offset: 942b0000
[ 8697.932460] CPU#1 finished
[ 8697.933244] pv stealtime: cpu 1, st:0x9000000096000680 phys:0x96000680
[ 8697.934185] Will online and init hotplugged CPU: 1
[ 8697.936965] rcu: INFO: rcu_sched self-detected stall on CPU
[ 8697.937770] rcu: 1-...!: (1 ticks this GP) idle=012/0/0x1 softirq=1/1 fqs=0
[ 8697.938719] rcu: (t=6597070 jiffies g=12093 q=141)
[ 8697.939389] rcu: rcu_sched kthread starved for 6597070 jiffies! g12093 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
[ 8697.940804] rcu: RCU grace-period kthread stack dump:
[ 8697.941507] rcu_sched I 0 10 2 0x00004000
[ 8697.942266] Stack : 900000009400ec00 0000000000000000 ffffffffffffffff 900000000161d940
[ 8697.943350] 90000000016442f0 0000000000000004 90000000ec537d40 0000000000000000
[ 8697.944444] ffffffffffffffff 900000000161d940 90000000016442f0 0000000000000000
[ 8697.945540] 90000000ec537d40 9000000094004a00 00000000000000b0 00000000ffff03b1
[ 8697.946619] 900000000163602c 900000000112407c 00000000ffff03b1 90000000011289bc
[ 8697.947695] 90000000ec537e08 00000000000000b4 0000000000000000 9000000094004c00
[ 8697.948780] 00000000ffff03b1 90000000002cad00 000000000c800000 90000000ec4d7700
[ 8697.949849] 90000000016442f0 0000000000000001 0000000000000001 9000000001636028
[ 8697.950936] 0000000000000005 900000000161d940 0000000000000006 90000000016442f0
[ 8697.952032] 900000000165c280 900000000165a280 900000000163602c 90000000002ba674
[ 8697.953134] ...
[ 8697.953492] Call Trace:
[ 8697.953868] [<9000000001123830>] __schedule+0x4e0/0xd00
[ 8697.954598] [<9000000001124078>] schedule+0x28/0x80
[ 8697.955253] [<90000000011289b8>] schedule_timeout+0x208/0x520
[ 8697.956062] [<90000000002ba670>] rcu_gp_kthread+0x9a0/0xab0
[ 8697.956838] [<900000000024f1fc>] kthread+0x12c/0x140
[ 8697.957530] [<90000000002031c8>] ret_from_kernel_thread+0x8/0x10
[ 8697.958342] Sending NMI from CPU 1 to CPUs 0:
[ 8707.959472] NMI backtrace for cpu 1
[ 8707.960111] CPU: 1 PID: 0 Comm: swapper/1 Kdump: loaded Tainted: G E 4.19.190-7.7.an8.loongarch64 #1
[ 8707.961530] Hardware name: Loongson KVM, BIOS 0.0.0 02/06/2015
[ 8707.962339] Stack : 0000000000000000 900000000111c120 90000000ec544000 90000000ec187ab0
[ 8707.963422] 0000000000000000 90000000ec187ab0 0000000000000000 00000000000000ff
[ 8707.964521] 0000000000000000 ffffffffffffffff 0000000000000020 0000000000aaaaaa
[ 8707.965627] 900000000111c120 0000000000000007 0000000000000006 0000000000000007
[ 8707.966713] 9000000096000950 0000000000aaaaaa 0000000000000205 0000000000000001
[ 8707.967798] ffff80010d27f020 00000000942b0000 0000000000000001 0000000000000000
[ 8707.968892] 9000000001768230 0000000000000000 0000000000000000 0000000000000000
[ 8707.970004] 900000000112da20 0000000000000240 9000000001636028 9000000001635f40
[ 8707.971094] 900000000020a334 0000000000000000 00000000000000b0 0000000000000004
[ 8707.973256] 0000000000000000 000000000007141c 0000000000000800 9000000001768230
[ 8707.975360] ...
[ 8707.976741] Call Trace:
[ 8707.978123] [<900000000020a334>] show_stack+0x34/0x140
[ 8707.979843] [<900000000111c11c>] dump_stack+0xac/0xe8
[ 8707.981530] [<90000000010ffe10>] nmi_cpu_backtrace+0xb0/0xf0
[ 8707.983269] [<9000000001100008>] nmi_trigger_cpumask_backtrace+0x1b8/0x1c0
[ 8707.985196] [<90000000011123a4>] rcu_dump_cpu_stacks+0x124/0x190
[ 8707.987008] [<90000000002bc690>] rcu_check_callbacks+0x940/0xa00
[ 8707.988780] [<90000000002cc384>] update_process_times+0x34/0x90
[ 8707.990571] [<90000000002df324>] tick_sched_handle+0x84/0xa0
[ 8707.992305] [<90000000002df7f4>] tick_sched_timer+0x44/0xa0
[ 8707.994013] [<90000000002ccfd4>] __hrtimer_run_queues+0x194/0x400
[ 8707.995755] [<90000000002ce2c0>] hrtimer_interrupt+0x140/0x380
[ 8707.997459] [<9000000000209694>] constant_timer_interrupt+0x34/0x50
[ 8707.999197] [<90000000002a4cf8>] __handle_irq_event_percpu+0x88/0x280
[ 8708.000922] [<90000000002a4f14>] handle_irq_event_percpu+0x24/0x90
[ 8708.002630] [<90000000002aadf0>] handle_percpu_irq+0x60/0xa0
[ 8708.004258] [<90000000002a3878>] generic_handle_irq+0x28/0x50
[ 8708.005889] [<900000000112a6ec>] do_IRQ+0x1c/0x30
[ 8708.007386] [<9000000000203430>] except_vec_vi_handler+0xac/0xdc
[ 8708.009075] [<9000000000203380>] __cpu_wait+0x20/0x24
[ 8708.010626] [<9000000000264468>] do_idle+0x258/0x300
[ 8708.012160] [<90000000002646a0>] cpu_startup_entry+0x20/0x30
[ 8708.013804] [<900000000111d148>] smp_bootstrap+0x50/0x58

Expected results:
After the CPU hotplug, the CPU is added successfully and comes online automatically, and the virtual machine keeps working normally.
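One way to check the expected result from inside the guest is to read the kernel's online-CPU list after the `device_add`. The following is only a sketch and was not part of the original test run. The function names are hypothetical, and it assumes the guest exposes `/sys/devices/system/cpu/online` in the usual range-list form (for example `0-1`).

```python
# Hypothetical helper names; assumes the standard sysfs CPU list format.

def parse_cpu_list(text):
    """Expand a kernel CPU list such as '0-1,4' into [0, 1, 4]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # an empty string means no CPUs listed
        if "-" in part:
            start, end = part.split("-", 1)
            cpus.extend(range(int(start), int(end) + 1))
        else:
            cpus.append(int(part))
    return cpus


def hotplugged_cpu_is_online(online_text, cpu_id):
    """Return True if cpu_id appears in the given online-CPU list text."""
    return cpu_id in parse_cpu_list(online_text)


def read_online_cpus(path="/sys/devices/system/cpu/online"):
    """Read and parse the online-CPU list from sysfs (run inside the guest)."""
    with open(path) as f:
        return parse_cpu_list(f.read())
```

With the `-smp 1,maxcpus=2` configuration above, a successful hotplug of `core-id=1` would be expected to make `hotplugged_cpu_is_online(open("/sys/devices/system/cpu/online").read(), 1)` true in the guest.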
Created attachment 1078: Unable to boot into the system after installing with the 4.19 kernel selected
Created attachment 1079: The installed kernel, kernel-4.19.91-27.7.an8
Image used: anolis-8-x86_64-dvd1-20240202.0.iso
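The CPU hotplug hang cannot be reproduced for now. This scenario does not affect the product's core functionality, so the priority of this issue has been lowered to protect the product release.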
Test passed with kernel-4.19.190-7.9.an8.loongarch64.
Kernel RPM download: https://abs.openanolis.cn/all_project/1?tab=packages&package_id=43928