Bug 27979 - 5.10-devel + Kunpeng 920 7282C: running the LTP tpci test crashes the system
Summary: 5.10-devel + Kunpeng 920 7282C: running the LTP tpci test crashes the system
Status: RESOLVED FIXED
Alias: None
Product: ANCK 5.10 Dev
Classification: ANCK
Component: ARM
Version: unspecified
Hardware: All Linux
Importance: P3-Medium S3-normal
Target Milestone: ---
Assignee: baolinwang
QA Contact: shuming
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2025-12-10 14:08 UTC by inspursand
Modified: 2025-12-24 08:53 UTC
CC List: 2 users

See Also:


Attachments
dumpr log (296.95 KB, text/plain)
2025-12-10 14:08 UTC, inspursand
Description inspursand inspur_group 2025-12-10 14:08:57 UTC
Created attachment 1430
dumpr log

Description of problem:

On a Kunpeng 920 7282C, running the LTP tpci test crashes the system (kernel panic).
Version-Release number of selected component (if applicable):
5.10-devel

How reproducible:
./runltp -s tpci

Steps to Reproduce:
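1. Build and install the 5.10-devel kernel.
2. Build and install either the 5.10 LTP or mainline LTP (the issue reproduces with both).
3. Change to the LTP installation directory and run ./runltp -s tpci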

Actual results:
System crash (kernel panic)
Expected results:
PASS

Additional info:
[ 7156.655608] user pgtable: 4k pages, 48-bit VAs, pgdp=0000284029383000
[ 7156.662078] [00000000000000c8] pgd=00002840212e3003, p4d=00002840212e3003, pud=000028411f945003, pmd=0000000000000000
[ 7156.672759] Internal error: Oops: 96000006 [#1] SMP
[ 7156.677639] Modules linked in: ltp_tpci(OE) rfkill(E) ipmi_ssif(E) crct10dif_ce(E) ghash_ce(E) sm4_ce_gcm(E) sm4_ce_ccm(E) sm4_ce(E) sm4_ce_cipher(E) sm3_ce(E) sha1_ce(E) sbsa_gwdt(E) acpi_ipmi(E) hns3_pmu(E) ipmi_si(E) ipmi_devintf(E) ipmi_msghandler(E) spi_dw_mmio(E) vfat(E) fat(E) xfs(E) libcrc32c(E) sd_mod(E) t10_pi(E) sg(E) uas(E) hibmc_drm(E) drm_vram_helper(E) drm_kms_helper(E) usb_storage(E) syscopyarea(E) sysfillrect(E) sha2_ce(E) sysimgblt(E) fb_sys_fops(E) sha256_arm64(E) cec(E) hns3(E) drm_ttm_helper(E) hisi_sas_v3_hw(E) ttm(E) hisi_sas_main(E) hclge(E) hnae3(E) drm(E) libsas(E) scsi_transport_sas(E) hns_mdio(E) xsc_eth(E) xsc_pci(E) sssnic(E) sssdk(E)
[ 7156.736614] CPU: 298 PID: 109409 Comm: tpci Kdump: loaded Tainted: G           OE     5.10.134 #1
[ 7156.745478] Hardware name: To be filled by O.E.M. CS5280K3/BC83AMDAE-7282C, BIOS 32.52 09/18/2025
[ 7156.754345] pstate: 40401009 (nZcv daif +PAN -UAO -TCO BTYPE=--)
[ 7156.760375] pc : pci_dma_cleanup+0xc/0x38
[ 7156.764395] lr : device_release_driver_internal+0x170/0x210
[ 7156.769960] sp : ffff80005faabb10
[ 7156.773268] x29: ffff80005faabb10 x28: ffff53ad0b2d3c00 
[ 7156.778580] x27: 0000000000000000 x26: 0000000000000000 
[ 7156.783889] x25: 000000000000000b x24: ffff53ed034af800 
[ 7156.789199] x23: ffffa627b3e03c98 x22: 0000000000000000 
[ 7156.794508] x21: ffffa627586d3060 x20: ffff33ed298340c8 
[ 7156.799817] x19: ffff33ed298330c8 x18: 0000000000000004 
[ 7156.805126] x17: 0000000000000000 x16: ffffa627b2d398b0 
[ 7156.810435] x15: ffff53ad00059d10 x14: ffff53ad40423348 
[ 7156.815744] x13: 0000000000000000 x12: 0000000000000000 
[ 7156.821053] x11: ffff53ad40423230 x10: ffff53ad00059d18 
[ 7156.826362] x9 : ffffa627b2ea5608 x8 : 0101010101010101 
[ 7156.831672] x7 : 7f7f7f7f7f7f7f7f x6 : 0000000000000000 
[ 7156.836981] x5 : 0000000000000000 x4 : ffffa627b3807cd8 
[ 7156.842290] x3 : 0000000000000000 x2 : 0000000000000000 
[ 7156.847599] x1 : 0000000000000000 x0 : ffff33ed298330c8 
[ 7156.852909] Call trace:
[ 7156.855356]  pci_dma_cleanup+0xc/0x38
[ 7156.859015]  driver_detach+0xb0/0x180
[ 7156.862682]  bus_remove_driver+0x84/0xf8
[ 7156.866600]  driver_unregister+0x34/0x68
[ 7156.870520]  pci_unregister_driver+0x28/0x120
[ 7156.874885]  sys_tcase+0x4f4/0x798 [ltp_tpci]
[ 7156.879237]  dev_attr_store+0x1c/0x38
[ 7156.882907]  sysfs_kf_write+0x48/0x60
[ 7156.886567]  kernfs_fop_write_iter+0x138/0x1d0
[ 7156.891019]  new_sync_write+0x108/0x190
[ 7156.894848]  vfs_write+0x1e0/0x2b0
[ 7156.898243]  ksys_write+0x6c/0x100
[ 7156.901641]  __arm64_sys_write+0x20/0x30
[ 7156.905572]  el0_svc_common.constprop.4+0xb4/0x210
[ 7156.910360]  do_el0_svc+0x80/0x90
[ 7156.913686]  el0_svc+0x1c/0x28
[ 7156.916742]  el0_sync_handler+0x88/0xb0
[ 7156.920575]  el0_sync+0x148/0x180
[ 7156.923888] Code: ffffa627 aa1e03e9 d503201f f9404401 (39432021) 
[ 7156.929984] ---[ end trace 45225384afa05e51 ]---
[ 7156.934598] Kernel panic - not syncing: Oops: Fatal exception
[ 7156.940340] SMP: stopping secondary CPUs
[ 7157.075491] Kernel Offset: 0x2627aa600000 from 0xffff800008000000
[ 7157.219810] PHYS_OFFSET: 0xffffd49300000000
[ 7157.470927] CPU features: 0x1,b1870817,7a600838
[ 7157.475449] Memory Limit: none
[ 7157.483524] Starting crashdump kernel...
[ 7157.487441] Bye!
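
The last two words of the "Code:" line can be decoded by hand to see what faulted. The following is a minimal sketch, assuming both words are AArch64 integer "load register (unsigned immediate)" instructions; the struct and function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type: fields of an A64 "LDR/LDRB Rt, [Rn, #offset]". */
struct ldr_uimm {
	unsigned size;   /* log2 of the access size in bytes (0 = byte, 3 = 64-bit) */
	unsigned rt;     /* destination register number */
	unsigned rn;     /* base register number */
	uint32_t offset; /* byte offset, i.e. imm12 scaled by the access size */
};

/*
 * Decode an integer load with unsigned immediate offset.
 * Returns false if the word does not have that encoding.
 */
static bool decode_ldr_uimm(uint32_t insn, struct ldr_uimm *out)
{
	/* bits 29:27 = 111, bit 26 (V) = 0, bits 25:24 = 01, opc (23:22) = 01 */
	if ((insn & 0x3fc00000u) != 0x39400000u)
		return false;

	out->size = insn >> 30;
	out->rt = insn & 0x1fu;
	out->rn = (insn >> 5) & 0x1fu;
	out->offset = ((insn >> 10) & 0xfffu) << out->size;
	return true;
}
```

Under that assumption, f9404401 decodes to a 64-bit load into x1 from [x0, #0x88], and the faulting word 39432021 decodes to a byte load into w1 from [x1, #0xc8]. With x1 equal to 0 in the register dump, the second load targets address 0xc8, which matches the reported fault address 00000000000000c8.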
Comment 1 GuixinLiu alibaba_cloud_group 2025-12-10 15:14:56 UTC
The patch "bus: platform,amba,fsl-mc,PCI: Add device DMA ownership management" that Inspur backported earlier is faulty. Upstream calls dma_cleanup before dev->driver is set to NULL, but your backport ended up calling dma_cleanup after dev->driver is set to NULL. dma_cleanup accesses dev->driver, so it crashes.
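
The ordering problem can be illustrated with a small user-space model. This is only a sketch: the toy_* types and functions are hypothetical stand-ins, not the kernel's definitions, and it assumes the cleanup callback consults a per-driver flag through dev->driver. Where the kernel would dereference a NULL pointer, the model returns false instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures (illustration only). */
struct toy_driver {
	bool driver_managed_dma;
};

struct toy_device {
	struct toy_driver *driver;
	int default_domain_users; /* stand-in for IOMMU default-domain usage */
};

/*
 * Shaped like a bus dma_cleanup callback: it needs dev->driver.
 * Returns false where the real kernel code would hit a NULL dereference.
 */
static bool toy_dma_cleanup(struct toy_device *dev)
{
	if (dev->driver == NULL)
		return false;
	if (!dev->driver->driver_managed_dma)
		dev->default_domain_users--;
	return true;
}

/* Ordering described as upstream: clean up first, then clear dev->driver. */
static bool release_cleanup_first(struct toy_device *dev)
{
	bool ok = toy_dma_cleanup(dev);

	dev->driver = NULL;
	return ok;
}

/* Ordering described as the faulty backport: clear dev->driver first. */
static bool release_clear_first(struct toy_device *dev)
{
	dev->driver = NULL;
	return toy_dma_cleanup(dev);
}
```

In this model, release_cleanup_first() succeeds and drops the usage count, while release_clear_first() reports failure because the driver pointer is already gone by the time the cleanup runs.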
Comment 2 小龙 admin 2025-12-10 16:52:42 UTC
The PR Link: https://gitee.com/anolis/cloud-kernel/pulls/6184
Comment 3 chuguangqing inspur_group 2025-12-24 08:52:53 UTC
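(In reply to GuixinLiu from comment #1)
> The patch "bus: platform,amba,fsl-mc,PCI: Add device DMA ownership management"
> that Inspur backported earlier is faulty. Upstream calls dma_cleanup before
> dev->driver is set to NULL, but your backport ended up calling dma_cleanup
> after dev->driver is set to NULL. dma_cleanup accesses dev->driver, so it
> crashes.

The existing Anolis code in this area is ordered differently from upstream. We did not check this carefully when merging, so the newly added dma_cleanup call ended up at the wrong position.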
Comment 4 chuguangqing inspur_group 2025-12-24 08:53:45 UTC
Fix confirmed.