Bug 7950 - [Anolis23.1][RC1][loongarch64] After CPU hotplug in a VM, the newly added CPU stays offline and must be enabled manually to come online
Status: CONFIRMED
Alias: None
Product: Anolis OS 23
Classification: Anolis OS
Component: BaseOS Modules
Version: 23.1
Hardware: loongarch Linux
Importance: P3-Medium S3-normal
Target Milestone: ---
Assignee: gaochang
QA Contact: bolong_tbl
Reported: 2024-01-18 19:17 UTC by wuzhiguo
Modified: 2025-02-08 10:44 UTC

Description wuzhiguo loongson_group 2024-01-18 19:17:09 UTC
Description of problem:
After a CPU is hot-plugged into the VM, the newly added CPU stays offline; it only comes online after being enabled manually.

Version-Release number of selected component (if applicable):
kernel version: 5.10.134-16.2_rc1.an23.loongarch64
qemu version: 6.2.0-4.an23

Steps to Reproduce:
1. Start the VM with the following command:
MALLOC_PERTURB_=1  /usr/bin/qemu-system-loongarch64 \
    -name 'avocado-vt-vm1'  \
    -sandbox on  \
    -machine loongson7a,memory-backend=mem-machine_mem \
    -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
    -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
    -nodefaults \
    -device VGA,bus=pcie.0,addr=0x2 \
    -m 2048 \
    -object memory-backend-ram,size=2048M,id=mem-machine_mem  \
    -smp 1,maxcpus=2,cores=2,threads=1,sockets=1  \
    -cpu 'Loongson-3A5000' \
    -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
    -device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0 \
    -blockdev node-name=file_image1,driver=file,filename=/root/avocado/data/avocado-vt/images/AnolisOS-23.1-loongarch64.qcow2 \
    -blockdev node-name=drive_image1,driver=qcow2,file=file_image1 \
    -device scsi-hd,id=image1,drive=drive_image1 \
    -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
    -device virtio-net-pci,mac=9a:71:8a:31:b2:50,id=idGpZRT0,netdev=idFQ18KT,bus=pcie-root-port-3,addr=0x0  \
    -netdev tap,id=idFQ18KT \
    -vnc :0  \
    -rtc base=utc,clock=host  \
    -boot menu=off,order=cdn,once=c,strict=off \
    -bios loongarch_bios.bin \
    -enable-kvm \
    -device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=5 \
    -serial stdio \
    -monitor telnet:localhost:4444,server,nowait
2. After the VM has booted, hot-plug a CPU from the qemu monitor with the following commands:
# telnet localhost 4444
Trying ::1...
Connected to localhost.
Escape character is '^]'.
QEMU 6.2.0 monitor - type 'help' for more information
(qemu) 
(qemu) device_add Loongson-3A5000-loongarch-cpu,id=vcpu1,core-id=1

Actual results:
After CPU hotplug in the VM, the newly added CPU is offline:
# lscpu 
Architecture:           loongarch64
  Byte Order:           Little Endian
CPU(s):                 2
  On-line CPU(s) list:  0
  Off-line CPU(s) list: 1
BIOS Vendor ID:         Loongson
Model name:             -
  BIOS Model name:      Loongson-3A5000-HV  CPU @ 2.0GHz
  BIOS CPU family:      1
  Thread(s) per core:   1
  Core(s) per socket:   1
  Socket(s):            1
  BogoMIPS:             4000.00
  Flags:                cpucfg lam ual fpu complex crypto
Caches (sum of all):    
  L1d:                  64 KiB (1 instance)
  L1i:                  64 KiB (1 instance)
  L2:                   256 KiB (1 instance)
  L3:                   16 MiB (1 instance)
NUMA:                   
  NUMA node(s):         1
  NUMA node0 CPU(s):    0
# cat /sys/devices/system/cpu/cpu1/online 
0 
# echo 1 >  /sys/devices/system/cpu/cpu1/online 
[  734.434973][  T878] Booting CPU#1...
[  734.437981][    T0] 64-bit Loongson Processor probed (LA464 Core)
[  734.439597][    T0] CPU1 revision is: 0014c010 (Loongson-64bit)
[  734.441063][    T0] FPU1 revision is: 00000000
[  734.442279][    T0] CPU#1 finished
[  734.445248][ T1048] pv stealtime: cpu 1, st:0x900000000a0005c0 phys:0xa0005c0
[  734.448890][ T1048] Will online and init hotplugged CPU: 1

Expected results:
After CPU hotplug, the CPU is added successfully and comes online automatically.
Comment 1 lixianglai 2024-01-18 19:22:42 UTC
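I installed the anolis23 kernel on another operating system, booted a VM, and ran the CPU hotplug test there; the CPU came online. This shows that the problem is not caused by the qemu VM or the guest kernel. It should be related to the operating system, and possibly to the /usr/lib/systemd/systemd-udevd policy.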
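
For context on the systemd-udevd hypothesis: some distributions bring hot-added CPUs online through a udev rule rather than in the kernel or qemu. The following is a minimal sketch of installing such a rule. It assumes, without confirmation from this report, that the affected Anolis OS 23 image has no equivalent rule; the file name 99-cpu-hotplug-online.rules and the function name install_cpu_online_rule are hypothetical.

```shell
#!/bin/sh
# Sketch: install a udev rule that onlines a CPU as soon as it is hot-added.
# The rule file name and this helper's name are illustrative assumptions.
install_cpu_online_rule() {
    # $1: destination rules file (hypothetical default path below)
    rule_file="${1:-/etc/udev/rules.d/99-cpu-hotplug-online.rules}"
    cat > "$rule_file" <<'EOF'
# Bring hot-added CPUs online automatically.
SUBSYSTEM=="cpu", ACTION=="add", TEST=="online", ATTR{online}=="0", ATTR{online}="1"
EOF
}
```

A possible use inside the guest, as root: run `install_cpu_online_rule`, then `udevadm control --reload`, and repeat the device_add step to see whether the new CPU comes online without manual intervention.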
Comment 2 wuzhiguo loongson_group 2024-01-18 19:55:50 UTC
x86 AnolisOS-23 has this problem as well.
Comment 3 gaochang alibaba_cloud_group 2024-04-29 15:23:14 UTC
Please retest with the latest qemu 8.2.
Comment 4 花豆豆雪 loongson_group 2025-02-08 10:44:10 UTC
A comparison test on the x86 platform follows:
x86 platform test and analysis procedure

Test ISO: anolisos-23.2
Kernel version: linux-6.6
qemu version: 8.2.2
ISO download URL: https://mirrors.openanolis.cn/anolis/23/isos/GA/x86_64/AnolisOS-23.2-x86_64-boot.iso

1. On the x86 machine, create a VM (anolis-23.2) and boot it. Then, inside that VM, use the same anolis-23.2 ISO to create a nested VM (anolis-23.2-x86-64-test) and produce its qcow2 file: anolis-23.2-x86-64-test.qcow2

2. Then, in a terminal of the anolis-23.2 VM, start the anolis-23.2-x86-64-test VM.
Command:
/usr/bin/qemu-system-x86_64 -M pc-i440fx-8.2 -accel kvm -cpu host -m 2048 -smp 2,maxcpus=4,sockets=1,cores=4,threads=1 -no-user-config -nodefaults -no-shutdown -boot strict=on -drive file=/var/lib/libvirt/images/anolis-23.2-x86-64-test.qcow2,if=virtio -nographic -serial stdio -monitor telnet:localhost:4444,server,nowait -msg timestamp=on

After entering the VM over the serial console, run lscpu:
[root@anolis ~]# lscpu
Architecture:             x86_64
  CPU op-mode(s):         32-bit, 64-bit
  Address sizes:          46 bits physical, 48 bits virtual
  Byte Order:             Little Endian
CPU(s):                   2
  On-line CPU(s) list:    0,1
Vendor ID:                GenuineIntel
  BIOS Vendor ID:         QEMU
  Model name:             12th Gen Intel(R) Core(TM) i7-12700
    BIOS Model name:      pc-i440fx-8.2  CPU @ 2.0GHz
    BIOS CPU family:      1
    CPU family:           6
    Model:                151
    Thread(s) per core:   1
    Core(s) per socket:   2
    Socket(s):            1
    Stepping:             2
    BogoMIPS:             4224.00

In another terminal of the anolis-23.2 VM, do the following to add one CPU:

# telnet localhost 4444
Trying ::1...
Connected to localhost.
Escape character is '^]'.
QEMU 8.2.2 monitor - type 'help' for more information
(qemu)
(qemu) device_add host-x86_64-cpu,core-id=2,id=vcpu2,socket-id=0,thread-id=0
(qemu)
Run lscpu again in the anolis-23.2-x86-64-test VM:
[root@anolis ~]# lscpu
Architecture:             x86_64
  CPU op-mode(s):         32-bit, 64-bit
  Address sizes:          46 bits physical, 48 bits virtual
  Byte Order:             Little Endian
CPU(s):                   3
  On-line CPU(s) list:    0,1
  Off-line CPU(s) list:   2
Vendor ID:                GenuineIntel
  BIOS Vendor ID:         QEMU
  Model name:             12th Gen Intel(R) Core(TM) i7-12700
    BIOS Model name:      pc-i440fx-8.2  CPU @ 2.0GHz
    BIOS CPU family:      1
    CPU family:           6
    Model:                151
    Thread(s) per core:   1
    Core(s) per socket:   2
    Socket(s):            1
    Stepping:             2
    BogoMIPS:             4224.00

This shows that on the x86 platform with qemu-8.2.2, a CPU added through device_add is also offline by default.

The newly added CPU can be brought online manually as follows:
[root@anolis ~]# cat /sys/devices/system/cpu/cpu2/online
0
[root@anolis ~]# echo 1 > /sys/devices/system/cpu/cpu2/online
[ 1409.868102] smpboot: Booting Node 0 Processor 2 APIC 0x2
[ 1409.891705] Will online and init hotplugged CPU: 2
[root@anolis ~]# lscpu
Architecture:             x86_64
  CPU op-mode(s):         32-bit, 64-bit
  Address sizes:          46 bits physical, 48 bits virtual
  Byte Order:             Little Endian
CPU(s):                   3
  On-line CPU(s) list:    0-2
Vendor ID:                GenuineIntel
  BIOS Vendor ID:         QEMU
  Model name:             12th Gen Intel(R) Core(TM) i7-12700
    BIOS Model name:      pc-i440fx-8.2  CPU @ 2.0GHz
    BIOS CPU family:      1
    CPU family:           6
    Model:                151
    Thread(s) per core:   1
    Core(s) per socket:   3
    Socket(s):            1
    Stepping:             2
    BogoMIPS:             4224.00
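
The manual step above (writing 1 to the CPU's online attribute) can be generalized to every offline CPU. This is a minimal sketch of a workaround, not a fix for the bug; the function name online_offline_cpus and its optional directory argument are hypothetical, and the argument exists only so the function can be tried against a scratch directory instead of the real sysfs tree.

```shell
#!/bin/sh
# Sketch: online every CPU whose sysfs "online" attribute reads 0.
# online_offline_cpus and its optional argument are hypothetical names.
online_offline_cpus() {
    # $1: sysfs CPU directory; overridable so the function can be dry-run
    cpu_root="${1:-/sys/devices/system/cpu}"
    for f in "$cpu_root"/cpu[0-9]*/online; do
        # Skip when the glob matched nothing (pattern left unexpanded)
        [ -e "$f" ] || continue
        if [ "$(cat "$f")" = "0" ]; then
            # Same operation as the manual "echo 1 > .../online" step
            echo 1 > "$f" && echo "onlined ${f%/online}"
        fi
    done
}
```

Run as root in the guest after device_add, `online_offline_cpus` with no argument acts on /sys/devices/system/cpu. CPUs that expose no online attribute are skipped because the glob only matches existing attribute files.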

In a terminal of the anolis-23.2 VM, do the following to remove the added CPU:
# telnet localhost 4444
Trying ::1...
Connected to localhost.
Escape character is '^]'.
QEMU 8.2.2 monitor - type 'help' for more information
(qemu) device_del vcpu2
(qemu)

Then run lscpu in the anolis-23.2-x86-64-test VM:
[root@anolis ~]# lscpu
Architecture:             x86_64
  CPU op-mode(s):         32-bit, 64-bit
  Address sizes:          46 bits physical, 48 bits virtual
  Byte Order:             Little Endian
CPU(s):                   2
  On-line CPU(s) list:    0,1
Vendor ID:                GenuineIntel
  BIOS Vendor ID:         QEMU
  Model name:             12th Gen Intel(R) Core(TM) i7-12700
    BIOS Model name:      pc-i440fx-8.2  CPU @ 2.0GHz
    BIOS CPU family:      1
    CPU family:           6
    Model:                151
    Thread(s) per core:   1
    Core(s) per socket:   2
    Socket(s):            1
    Stepping:             2
    BogoMIPS:             4224.00


*** Therefore, a CPU added through device_add is offline by default on both the x86 and loongarch platforms.
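
For scripted hotplug tests, the offline state reported by lscpu can also be detected directly from sysfs by comparing the present and online CPU lists. This is a minimal sketch; the function name all_present_cpus_online and its optional directory argument are hypothetical, and it assumes both files use the same range-list format (for example 0-2), as the kernel prints them.

```shell
#!/bin/sh
# Sketch: succeed only when every present CPU is also online.
# all_present_cpus_online and its optional argument are hypothetical names.
all_present_cpus_online() {
    # $1: sysfs CPU directory; overridable for testing on a scratch directory
    cpu_root="${1:-/sys/devices/system/cpu}"
    # Both files hold a CPU range list; equal strings mean no offline CPU
    [ "$(cat "$cpu_root/present")" = "$(cat "$cpu_root/online")" ]
}
```

In the state shown after device_add, where CPU 2 is listed as offline, a check such as `all_present_cpus_online || echo "hot-added CPU still offline"` would be expected to print the message.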