Created attachment 1430 [details]
dumpr log

Description of problem:
On a Kunpeng 920 7282C machine, running the LTP tpci test crashes the kernel immediately.

Version-Release number of selected component (if applicable):
5.10-devel

How reproducible:
./runltp -s tpci

Steps to Reproduce:
1. Build and install the 5.10-devel kernel.
2. Build and install the 5.10 LTP or mainline LTP (the issue reproduces with either).
3. Change to the LTP installation directory and run ./runltp -s tpci

Actual results:
The system crashes (kernel panic).

Expected results:
PASS

Additional info:
[ 7156.655608] user pgtable: 4k pages, 48-bit VAs, pgdp=0000284029383000
[ 7156.662078] [00000000000000c8] pgd=00002840212e3003, p4d=00002840212e3003, pud=000028411f945003, pmd=0000000000000000
[ 7156.672759] Internal error: Oops: 96000006 [#1] SMP
[ 7156.677639] Modules linked in: ltp_tpci(OE) rfkill(E) ipmi_ssif(E) crct10dif_ce(E) ghash_ce(E) sm4_ce_gcm(E) sm4_ce_ccm(E) sm4_ce(E) sm4_ce_cipher(E) sm3_ce(E) sha1_ce(E) sbsa_gwdt(E) acpi_ipmi(E) hns3_pmu(E) ipmi_si(E) ipmi_devintf(E) ipmi_msghandler(E) spi_dw_mmio(E) vfat(E) fat(E) xfs(E) libcrc32c(E) sd_mod(E) t10_pi(E) sg(E) uas(E) hibmc_drm(E) drm_vram_helper(E) drm_kms_helper(E) usb_storage(E) syscopyarea(E) sysfillrect(E) sha2_ce(E) sysimgblt(E) fb_sys_fops(E) sha256_arm64(E) cec(E) hns3(E) drm_ttm_helper(E) hisi_sas_v3_hw(E) ttm(E) hisi_sas_main(E) hclge(E) hnae3(E) drm(E) libsas(E) scsi_transport_sas(E) hns_mdio(E) xsc_eth(E) xsc_pci(E) sssnic(E) sssdk(E)
[ 7156.736614] CPU: 298 PID: 109409 Comm: tpci Kdump: loaded Tainted: G OE 5.10.134 #1
[ 7156.745478] Hardware name: To be filled by O.E.M. CS5280K3/BC83AMDAE-7282C, BIOS 32.52 09/18/2025
[ 7156.754345] pstate: 40401009 (nZcv daif +PAN -UAO -TCO BTYPE=--)
[ 7156.760375] pc : pci_dma_cleanup+0xc/0x38
[ 7156.764395] lr : device_release_driver_internal+0x170/0x210
[ 7156.769960] sp : ffff80005faabb10
[ 7156.773268] x29: ffff80005faabb10 x28: ffff53ad0b2d3c00
[ 7156.778580] x27: 0000000000000000 x26: 0000000000000000
[ 7156.783889] x25: 000000000000000b x24: ffff53ed034af800
[ 7156.789199] x23: ffffa627b3e03c98 x22: 0000000000000000
[ 7156.794508] x21: ffffa627586d3060 x20: ffff33ed298340c8
[ 7156.799817] x19: ffff33ed298330c8 x18: 0000000000000004
[ 7156.805126] x17: 0000000000000000 x16: ffffa627b2d398b0
[ 7156.810435] x15: ffff53ad00059d10 x14: ffff53ad40423348
[ 7156.815744] x13: 0000000000000000 x12: 0000000000000000
[ 7156.821053] x11: ffff53ad40423230 x10: ffff53ad00059d18
[ 7156.826362] x9 : ffffa627b2ea5608 x8 : 0101010101010101
[ 7156.831672] x7 : 7f7f7f7f7f7f7f7f x6 : 0000000000000000
[ 7156.836981] x5 : 0000000000000000 x4 : ffffa627b3807cd8
[ 7156.842290] x3 : 0000000000000000 x2 : 0000000000000000
[ 7156.847599] x1 : 0000000000000000 x0 : ffff33ed298330c8
[ 7156.852909] Call trace:
[ 7156.855356] pci_dma_cleanup+0xc/0x38
[ 7156.859015] driver_detach+0xb0/0x180
[ 7156.862682] bus_remove_driver+0x84/0xf8
[ 7156.866600] driver_unregister+0x34/0x68
[ 7156.870520] pci_unregister_driver+0x28/0x120
[ 7156.874885] sys_tcase+0x4f4/0x798 [ltp_tpci]
[ 7156.879237] dev_attr_store+0x1c/0x38
[ 7156.882907] sysfs_kf_write+0x48/0x60
[ 7156.886567] kernfs_fop_write_iter+0x138/0x1d0
[ 7156.891019] new_sync_write+0x108/0x190
[ 7156.894848] vfs_write+0x1e0/0x2b0
[ 7156.898243] ksys_write+0x6c/0x100
[ 7156.901641] __arm64_sys_write+0x20/0x30
[ 7156.905572] el0_svc_common.constprop.4+0xb4/0x210
[ 7156.910360] do_el0_svc+0x80/0x90
[ 7156.913686] el0_svc+0x1c/0x28
[ 7156.916742] el0_sync_handler+0x88/0xb0
[ 7156.920575] el0_sync+0x148/0x180
[ 7156.923888] Code: ffffa627 aa1e03e9 d503201f f9404401 (39432021)
[ 7156.929984] ---[ end trace 45225384afa05e51 ]---
[ 7156.934598] Kernel panic - not syncing: Oops: Fatal exception
[ 7156.940340] SMP: stopping secondary CPUs
[ 7157.075491] Kernel Offset: 0x2627aa600000 from 0xffff800008000000
[ 7157.219810] PHYS_OFFSET: 0xffffd49300000000
[ 7157.470927] CPU features: 0x1,b1870817,7a600838
[ 7157.475449] Memory Limit: none
[ 7157.483524] Starting crashdump kernel...
[ 7157.487441] Bye!
The patch that Inspur backported earlier, "bus: platform,amba,fsl-mc,PCI: Add device DMA ownership management", is faulty. Upstream calls dma_cleanup before dev->driver is set to NULL, but your backport calls dma_cleanup after dev->driver has been set to NULL. dma_cleanup accesses dev->driver, so it crashes.
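To make the ordering problem concrete, here is a minimal userspace sketch in C. It is not the kernel code: the structures are simplified stand-ins, and the names bus_dma_cleanup, release_upstream_order and release_backport_order are hypothetical, chosen only for illustration. The callback returns -1 where the real pci_dma_cleanup would dereference a NULL driver pointer; the oops above (fault address 0xc8 with x1 = 0) looks consistent with such a field read through NULL, though that reading is an inference from the log rather than something stated in this thread.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumption: the real
 * definitions carry many more fields). */
struct device_driver {
    const char *name;
    int driver_managed_dma;
};

struct device {
    struct device_driver *driver;
    int dma_owned;
};

/* Hypothetical bus callback shaped like a dma_cleanup hook: it has to
 * read dev->driver to decide whether to release DMA ownership. */
static int bus_dma_cleanup(struct device *dev)
{
    if (dev->driver == NULL)
        return -1; /* the real kernel has no such check and would oops */
    if (!dev->driver->driver_managed_dma)
        dev->dma_owned = 0;
    return 0;
}

/* Order described for upstream: cleanup first, then clear dev->driver. */
static int release_upstream_order(struct device *dev)
{
    int ret = bus_dma_cleanup(dev);
    dev->driver = NULL;
    return ret;
}

/* Order described for the faulty backport: dev->driver is cleared
 * before the cleanup callback gets a chance to read it. */
static int release_backport_order(struct device *dev)
{
    dev->driver = NULL;
    return bus_dma_cleanup(dev);
}
```

For a device bound to a driver, release_upstream_order() returns 0 and clears dma_owned, while release_backport_order() returns -1, which stands in for the NULL dereference seen in the crash.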
The PR Link: https://gitee.com/anolis/cloud-kernel/pulls/6184
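(In reply to GuixinLiu from comment #1)
> The patch that Inspur backported earlier, "bus: platform,amba,fsl-mc,PCI: Add
> device DMA ownership management", is faulty. Upstream calls dma_cleanup before
> dev->driver is set to NULL, but your backport calls dma_cleanup after
> dev->driver has been set to NULL. dma_cleanup accesses dev->driver, so it
> crashes.

The existing Anolis code orders the statements at this point differently from upstream. We did not check this carefully when merging, so the newly added dma_cleanup call landed at the wrong position.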
Fix confirmed.