Bug 25500 - Fix the race that occurs during riscv iommu startup
Summary: Fix the race that occurs during riscv iommu startup
Status: NEW
Alias: None
Product: ANCK 6.6 Dev
Classification: ANCK
Component: drivers
Version: 6.6.y-3
Hardware: All Linux
Importance: P3-Medium S3-normal
Target Milestone: ---
Assignee: GuixinLiu
QA Contact: shuming
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2025-09-25 17:18 UTC by gaorui
Modified: 2025-09-25 17:18 UTC
CC List: 3 users

See Also:


Attachments

Description gaorui 2025-09-25 17:18:01 UTC
Description of problem:

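The iommu probe is expected to be called for every device added to the system, in the same order in which the devices are added. If it is called in a random order, or if whether it is called at all depends on driver binding, all kinds of unexpected concurrency problems and race conditions occur.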
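
One way to check which devices the iommu probe has actually reached is to look at the iommu_group link that sysfs exposes for each PCI device. The following is only a sketch: the function name list_iommu_groups and its optional sysfs-root argument are made up for illustration, and it only reports the final state, not the order in which devices were probed.

```shell
#!/bin/sh
# Print "<PCI address> <iommu group number|none>" for every PCI device.
# The optional first argument overrides the sysfs root (defaults to /sys);
# it exists only so the function can be pointed at a test directory.
list_iommu_groups() {
    root="${1:-/sys}"
    for dev in "$root"/bus/pci/devices/*; do
        # Skip the unexpanded glob when there are no PCI devices.
        [ -e "$dev" ] || continue
        if [ -L "$dev/iommu_group" ]; then
            # The link points at .../kernel/iommu_groups/<N>.
            grp=$(basename "$(readlink "$dev/iommu_group")")
        else
            grp="none"
        fi
        echo "$(basename "$dev") $grp"
    done
}
```

Running list_iommu_groups in the guest and looking for devices that report "none" may help narrow down which devices were skipped.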


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Boot a UEFI image with virtio-scsi and the iommu configured; the image uses systemd as PID 1:

/qemu-system-riscv64 \
-nographic -M virt,pflash0=pflash0,pflash1=pflash1,acpi=off,aia=aplic-imsic,aia-guests=5 \
-blockdev node-name=pflash0,driver=file,read-only=on,filename=./RISCV_VIRT_CODE.fd \
-blockdev node-name=pflash1,driver=file,filename=./RISCV_VIRT_VARS.fd -accel tcg -smp 32 -m 8G \
-drive file=./rv.qcow2,format=qcow2,id=hd0,if=none -object rng-random,filename=/dev/urandom,id=rng0 \
-device virtio-rng-device,rng=rng0 -device virtio-blk-device,drive=hd0,bootindex=1 \
-device '{"driver":"pcie-root-port","port":8,"chassis":1,"id":"pci.1","bus":"pcie.0","multifunction":true,"addr":"0x1"}' \
-device '{"driver":"pcie-root-port","port":9,"chassis":2,"id":"pci.2","bus":"pcie.0","addr":"0x1.0x1"}' \
-device '{"driver":"pcie-root-port","port":10,"chassis":3,"id":"pci.3","bus":"pcie.0","addr":"0x1.0x2"}' \
-device '{"driver":"pcie-root-port","port":11,"chassis":4,"id":"pci.4","bus":"pcie.0","addr":"0x1.0x3"}' \
-device '{"driver":"pcie-pci-bridge","id":"pci.5","bus":"pci.1","addr":"0x0"}' \
-device '{"driver":"pcie-root-port","port":12,"chassis":6,"id":"pci.6","bus":"pcie.0","addr":"0x1.0x4"}' \
-device '{"driver":"pcie-root-port","port":13,"chassis":7,"id":"pci.7","bus":"pcie.0","addr":"0x1.0x5"}' \
-device '{"driver":"pcie-root-port","port":14,"chassis":8,"id":"pci.8","bus":"pcie.0","addr":"0x1.0x6"}' \
-device '{"driver":"pcie-root-port","port":15,"chassis":9,"id":"pci.9","bus":"pcie.0","addr":"0x1.0x7"}' \
-device '{"driver":"pcie-root-port","port":16,"chassis":10,"id":"pci.10","bus":"pcie.0","multifunction":true,"addr":"0x2"}' \
-device '{"driver":"pcie-root-port","port":17,"chassis":11,"id":"pci.11","bus":"pcie.0","addr":"0x2.0x1"}' \
-device '{"driver":"pcie-root-port","port":18,"chassis":12,"id":"pci.12","bus":"pcie.0","addr":"0x2.0x2"}' \
-device '{"driver":"usb-ehci","id":"usb","bus":"pci.5","addr":"0x1"}' \
-device '{"driver":"virtio-serial-pci","id":"virtio-serial0","bus":"pci.6","addr":"0x0"}' \
-netdev user,id=hostnet0 \
-netdev user,id=hostnet1 \
-device '{"driver":"virtio-net-pci","netdev":"hostnet0","id":"net0","mac":"24:9e:4e:ba:6f:32","bus":"pci.2","addr":"0x0"}' \
-device '{"driver":"virtio-net-pci","netdev":"hostnet1","id":"net1","mac":"24:9e:4e:ba:6f:38","bus":"pci.3","addr":"0x0"}' \
-device riscv-iommu-pci,vendor-id=0x1efd,device-id=0x8,bus=pcie.0,addr=0xf \

2. After booting, the system enters rescue mode and its udev-worker processes do not respond:
[root@test-pc ~]# ps aux | grep D
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 139 0.1 0.0 0 0 ? D 07:59 0:00 [kworker/u32:6+async]
root 910 2.2 0.2 36136 8520 ? D 08:01 0:00 (udev-worker)
root 940 1.9 0.1 36080 6708 ? D 08:01 0:00 (udev-worker)
root 978 0.2 0.0 12980 2720 ? D 08:01 0:00 /sbin/modprobe -q -- fs-vfat
root 1010 5.1 0.0 41940 2740 ? D 08:01 0:01 /sbin/modprobe -q -- fs-rpc_pipefs
root 1082 33.3 0.0 221968 3580 ttyS0 S+ 08:01 0:00 grep --color=auto D
[root@test-pc ~]# cat /proc/139/stack
[<0>] blk_execute_rq+0xc2/0x130
[<0>] scsi_execute_cmd+0x9c/0x212
[<0>] scsi_probe_lun.constprop.0+0x13c/0x400
[<0>] scsi_probe_and_add_lun+0x9c/0x39e
[<0>] __scsi_scan_target+0xbc/0x1cc
[<0>] scsi_scan_channel+0x4a/0x78
[<0>] scsi_scan_host_selected+0xc6/0x118
[<0>] do_scsi_scan_host+0x66/0x6e
[<0>] do_scan_async+0x12/0x1ac
[<0>] async_run_entry_fn+0x28/0x12a
[<0>] process_one_work+0x152/0x348
[<0>] worker_thread+0x1e0/0x2b8
[<0>] kthread+0xcc/0xec
[<0>] ret_from_fork+0xe/0x18
[root@test-pc ~]# cat /proc/910/stack
[<0>] async_synchronize_cookie_domain+0xb4/0xec
[<0>] async_synchronize_full+0x10/0x18
[<0>] do_init_module+0x144/0x1d4
[<0>] load_module+0x6da/0x840
[<0>] init_module_from_file+0x76/0xb0
[<0>] idempotent_init_module+0x198/0x27a
[<0>] __riscv_sys_finit_module+0x54/0xa6
[<0>] do_trap_ecall_u+0x1da/0x224
[<0>] handle_exception+0x150/0x15c
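
In the traces above, PID 910 is waiting in async_synchronize_full while PID 139 is the async SCSI scan blocked in blk_execute_rq. To collect the same information for every blocked task in one pass, something like the following could be used. It is a rough sketch: the helper names list_d_state and dump_d_stacks are made up here, and reading /proc/<pid>/stack normally requires root.

```shell
#!/bin/sh
# Read "PID STAT COMMAND" lines on stdin and print the PIDs whose state
# starts with D (uninterruptible sleep), e.g. "D", "D+", "Ds".
list_d_state() {
    awk '$2 ~ /^D/ { print $1 }'
}

# Dump the kernel stack of every task currently in D state.
dump_d_stacks() {
    ps -eo pid=,stat=,comm= | list_d_state | while read -r pid; do
        echo "== PID $pid =="
        # The task may have exited, or the file may be unreadable.
        cat "/proc/$pid/stack" 2>/dev/null || echo "(stack unavailable)"
    done
}
```

Calling dump_d_stacks as root in the rescue shell should print one stack per blocked task. Unlike `ps aux | grep D`, it matches only on the state column.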

3.

Actual results:


Expected results:


Additional info: