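Bug 18746 - Bridge memory allocation fails after NVMe card hot-plug on the 5.10.134-010.ali5000.pro.al8 kernel
Summary: Bridge memory allocation fails after NVMe card hot-plug on the 5.10.134-010.ali5000.pro.al8 kernel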
Status: RESOLVED FIXED
Alias: None
Product: ANCK 5.10 Dev
Classification: ANCK
Component: drivers
Version: unspecified
Hardware: x86_64 Linux
Importance: P3-Medium S3-normal
Target Milestone: ---
Assignee: hobbit
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2025-02-12 10:59 UTC by chuguangqing
Modified: 2025-02-19 15:42 UTC

See Also:


Description chuguangqing inspur_group 2025-02-12 10:59:24 UTC
Description of problem:
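On the 5.10.134-010.ali5000.pro.al8 kernel, bridge memory allocation fails after an NVMe card is hot-unplugged and re-plugged.

Before the hot-plug, "Memory behind bridge:" was 201M; after it, the window is disabled: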

4a:03.2 PCI bridge: Chengdu Haiguang IC Design Co., Ltd. Device 1483 (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin ? routed to IRQ 46
	NUMA node: 1
	IOMMU group: 56
	Bus: primary=4a, secondary=68, subordinate=84, sec-latency=0
	I/O behind bridge: 0000f000-00000fff [disabled]
	Memory behind bridge: fff00000-000fffff [disabled]
	Prefetchable memory behind bridge: 00001ffb3f000000-00001ffb7effffff [size=1G]
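
For reference, the "[disabled]" state above follows from the raw values: a bridge window whose base is above its limit forwards nothing. A minimal Python sketch for checking this, assuming the `lspci -vv` line format shown in this report (the helper name `parse_bridge_window` is hypothetical, not part of any tool):

```python
import re

# Matches the three "... behind bridge:" lines printed by lspci -vv.
WINDOW_RE = re.compile(
    r"^\s*(?P<kind>I/O|Memory|Prefetchable memory) behind bridge:\s*"
    r"(?P<base>[0-9a-fA-F]+)-(?P<limit>[0-9a-fA-F]+)"
)


def parse_bridge_window(line):
    """Parse one lspci bridge-window line; return None if it does not match."""
    m = WINDOW_RE.match(line)
    if m is None:
        return None
    base = int(m.group("base"), 16)
    limit = int(m.group("limit"), 16)
    enabled = base <= limit  # base above limit means the window is disabled
    size = limit - base + 1 if enabled else 0
    return {"kind": m.group("kind"), "base": base, "limit": limit,
            "enabled": enabled, "size": size}


after = parse_bridge_window("\tMemory behind bridge: fff00000-000fffff [disabled]")
before = parse_bridge_window("Memory behind bridge: ab400000-b7cfffff [size=201M]")
print(after["enabled"])            # False
print(before["size"] // (1 << 20)) # 201
```

The same helper applied to the prefetchable line gives 1 GiB, matching the `[size=1G]` annotation.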

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Remove the NVMe card located behind the PCI bridge.
2. Reinsert the NVMe card behind the PCI bridge.
3. Check the PCI bridge's I/O and memory windows: lspci -vvvs 4a:03.2

Actual results:
Memory behind bridge: fff00000-000fffff [disabled]

Expected results:

Memory behind bridge: ab400000-b7cfffff [size=201M]
Additional info:

[380997.559501] pcieport 0000:4a:03.2: pciehp: pending interrupts 0x0100 from Slot Status
[380997.559625] pcieport 0000:4a:03.2: pciehp: Slot(1): Link Down
[380997.559669] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 512
[380997.566164] pcieport 0000:4a:03.2: pciehp: pciehp_get_power_status: SLOTCTRL 70 value read 1128
[380997.575665] {1}[Hardware Error]: It has been corrected by h/w and requires no further action
[380997.575667] {1}[Hardware Error]: event severity: corrected
[380997.575670] {1}[Hardware Error]:  Error 0, type: corrected
[380997.575672] {1}[Hardware Error]:  fru_text: PcieError
[380997.575673] {1}[Hardware Error]:   section_type: PCIe error
[380997.575679] {1}[Hardware Error]:   port_type: 4, root port
[380997.585203] pcieport 0000:4a:03.2: pciehp: pciehp_unconfigure_device: domain:bus:dev = 0000:68:00
[380997.591405] {1}[Hardware Error]:   version: 0.2
[380997.591407] {1}[Hardware Error]:   command: 0x0507, status: 0x0010
[380997.591409] {1}[Hardware Error]:   device_id: 0000:4a:03.2
[380997.591410] {1}[Hardware Error]:   slot: 8
[380997.591411] {1}[Hardware Error]:   secondary_bus: 0x68
[380997.591413] {1}[Hardware Error]:   vendor_id: 0x1d94, device_id: 0x1483
[380997.591415] {1}[Hardware Error]:   class_code: 060400
[380997.591420] {1}[Hardware Error]:   bridge: secondary_status: 0x2000, control: 0x0002
[380997.666952] pcieport 0000:4a:03.2: AER: aer_status: 0x00002001, aer_mask: 0x00003100
[380997.675712] pcieport 0000:4a:03.2:    [ 0] RxErr                  (First)
[380997.683391] pcieport 0000:4a:03.2: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
[380998.167830] pcieport 0000:4a:03.2: pciehp: pending interrupts 0x0008 from Slot Status
[380998.809713] pcieport 0000:69:00.0: pciehp: Timeout on hotplug command 0x0000 (issued 1212 msec ago)
[380998.819923] pcieport 0000:69:00.0: pciehp: pcie_disable_notification: SLOTCTRL 80 write cmd 0
[380998.820000] pci_bus 0000:6a: dev 00, dec refcount to 0
[380998.820075] pci_bus 0000:6a: dev 00, released physical slot 0
[380998.820534] pci_bus 0000:6a: busn_res: [bus 6a] is released
[380998.827178] pci 0000:69:00.0: Removing from iommu group 64
[380998.833465] pci_bus 0000:69: busn_res: [bus 69-6a] is released
[380998.840336] pci 0000:68:00.0: EDR: Notify handler removed
[380998.840465] pci 0000:68:00.0: Removing from iommu group 63
[380998.846764] pcieport 0000:4a:03.2: pciehp: pciehp_power_off_slot: SLOTCTRL 70 write cmd 400
[380999.863725] pcieport 0000:4a:03.2: pciehp: pciehp_set_indicators: SLOTCTRL 70 write cmd 300
[380999.863730] pcieport 0000:4a:03.2: pciehp: pciehp_check_link_active: lnk_status = d081
[381018.781704] pcieport 0000:4a:03.2: pciehp: pending interrupts 0x0008 from Slot Status
[381018.781772] pcieport 0000:4a:03.2: pciehp: pciehp_check_link_active: lnk_status = d081
[381018.781775] pcieport 0000:4a:03.2: pciehp: Slot(1): Card present
[381018.788597] pcieport 0000:4a:03.2: pciehp: pciehp_get_power_status: SLOTCTRL 70 value read 1728
[381018.788601] pcieport 0000:4a:03.2: pciehp: pciehp_power_on_slot: SLOTCTRL 70 write cmd 0
[381018.788603] pcieport 0000:4a:03.2: pciehp: __pciehp_link_set: lnk_ctrl = 0
[381018.788605] pcieport 0000:4a:03.2: pciehp: pciehp_set_indicators: SLOTCTRL 70 write cmd 200
[381018.829034] pcieport 0000:4a:03.2: pciehp: pending interrupts 0x0100 from Slot Status
[381018.935720] pcieport 0000:4a:03.2: pciehp: pciehp_check_link_status: lnk_status = f083
[381018.935764] pci 0000:68:00.0: [10b5:8733] type 01 class 0x060400
[381018.942588] pci 0000:68:00.0: reg 0x10: [mem 0x00000000-0x0003ffff]
[381018.949718] pci 0000:68:00.0: Max Payload Size set to 512 (was 128, max 2048)
[381018.957872] pci 0000:68:00.0: PME# supported from D0 D3hot D3cold
[381018.964834] pci 0000:68:00.0: 63.008 Gb/s available PCIe bandwidth, limited by 8.0 GT/s PCIe x8 link at 0000:4a:03.2 (capable of 126.016 Gb/s with 8.0 GT/s PCIe x16 link)
[381018.981994] pci 0000:68:00.0: EDR: Notify handler installed
[381018.982439] pci 0000:68:00.0: Adding to iommu group 63
[381018.990745] pci 0000:68:00.0: scanning [bus 00-00] behind bridge, pass 0
[381018.990748] pci 0000:68:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[381018.999798] pci 0000:68:00.0: scanning [bus 00-00] behind bridge, pass 1
[381018.999954] pci_bus 0000:69: scanning bus
[381018.999974] pci 0000:69:00.0: [10b5:8733] type 01 class 0x060400
[381019.006821] pci 0000:69:00.0: Max Payload Size set to 512 (was 2048, max 2048)
[381019.015056] pci 0000:69:00.0: PME# supported from D0 D3hot D3cold
[381019.022089] pci 0000:69:00.0: Adding to iommu group 64
[381019.027979] pci_bus 0000:69: fixups for bus
[381019.027980] pci 0000:68:00.0: PCI bridge to [bus 69-84]
[381019.033929] pci 0000:69:00.0: scanning [bus 00-00] behind bridge, pass 0
[381019.033931] pci 0000:69:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[381019.042976] pci 0000:69:00.0: scanning [bus 00-00] behind bridge, pass 1
[381019.043034] pci_bus 0000:6a: scanning bus
[381019.043037] pci_bus 0000:6a: fixups for bus
[381019.043038] pci 0000:69:00.0: PCI bridge to [bus 6a-84]
[381019.048988] pci_bus 0000:6a: [bus 6a-84] extended by 0x1a
[381019.048989] pci_bus 0000:6a: bus scan returning with max=84
[381019.048992] pci_bus 0000:6a: busn_res: [bus 6a-84] end is updated to 84
[381019.056486] pci_bus 0000:69: bus scan returning with max=84
[381019.056487] pci_bus 0000:69: busn_res: [bus 69-84] end is updated to 84
[381019.063991] pci 0000:69:00.0: bridge window [io  0x1000-0x0fff] to [bus 6a-84] add_size 1000
[381019.073524] pci 0000:69:00.0: bridge window [mem 0x00100000-0x000fffff 64bit pref] to [bus 6a-84] add_size 200000 add_align 100000
[381019.086738] pci 0000:69:00.0: bridge window [mem 0x00100000-0x000fffff] to [bus 6a-84] add_size 200000 add_align 100000
[381019.098889] pci 0000:68:00.0: bridge window [io  0x1000-0x0fff] to [bus 69-84] add_size 1000
[381019.108418] pci 0000:68:00.0: bridge window [mem 0x00100000-0x001fffff 64bit pref] to [bus 69-84] add_size 200000 add_align 100000
[381019.121640] pci 0000:68:00.0: bridge window [mem 0x00100000-0x001fffff] to [bus 69-84] add_size 200000 add_align 100000
[381019.133777] pcieport 0000:4a:03.2: bridge window [io  0x1000-0x0fff] to [bus 68-84] add_size 2000
[381019.143794] pci 0000:68:00.0: bridge window [mem 0x00100000-0x001fffff] extended by 0x000000000c800000
[381019.143798] pci 0000:68:00.0: bridge window [mem 0x00100000-0x001fffff 64bit pref] extended by 0x000000003ff00000
[381019.143803] pci 0000:69:00.0: bridge window [mem 0x00100000-0x000fffff] extended by 0x000000000c900000
[381019.143804] pci 0000:69:00.0: bridge window [mem 0x00100000-0x000fffff 64bit pref] extended by 0x0000000040000000
[381019.143809] pcieport 0000:4a:03.2: BAR 13: no space for [io  size 0x2000]
[381019.151492] pcieport 0000:4a:03.2: BAR 13: failed to assign [io  size 0x2000]
[381019.159562] pcieport 0000:4a:03.2: BAR 13: no space for [io  size 0x2000]
[381019.167243] pcieport 0000:4a:03.2: BAR 13: failed to assign [io  size 0x2000]
[381019.175319] pci 0000:68:00.0: BAR 14: assigned [mem 0xab400000-0xb7cfffff]
[381019.183096] pci 0000:68:00.0: BAR 15: assigned [mem 0x1ffb3f000000-0x1ffb7effffff 64bit pref]
[381019.192720] pci 0000:68:00.0: BAR 0: no space for [mem size 0x00040000]
[381019.200212] pci 0000:68:00.0: BAR 0: failed to assign [mem size 0x00040000]
[381019.208090] pci 0000:68:00.0: BAR 13: no space for [io  size 0x1000]
[381019.215288] pci 0000:68:00.0: BAR 13: failed to assign [io  size 0x1000]
[381019.222874] pci 0000:68:00.0: BAR 14: assigned [mem 0xab400000-0xb7cfffff]
[381019.230660] pci 0000:68:00.0: BAR 0: no space for [mem size 0x00040000]
[381019.238154] pci 0000:68:00.0: BAR 0: failed to assign [mem size 0x00040000]
[381019.246035] pci 0000:68:00.0: BAR 13: no space for [io  size 0x1000]
[381019.253246] pci 0000:68:00.0: BAR 13: failed to assign [io  size 0x1000]
[381019.260837] pci 0000:69:00.0: BAR 14: assigned [mem 0xab400000-0xb7cfffff]
[381019.268623] pci 0000:69:00.0: BAR 15: assigned [mem 0x1ffb3f000000-0x1ffb7effffff 64bit pref]
[381019.278250] pci 0000:69:00.0: BAR 13: no space for [io  size 0x1000]
[381019.285452] pci 0000:69:00.0: BAR 13: failed to assign [io  size 0x1000]
[381019.293040] pci 0000:69:00.0: BAR 13: no space for [io  size 0x1000]
[381019.300240] pci 0000:69:00.0: BAR 13: failed to assign [io  size 0x1000]
[381019.307827] pci 0000:69:00.0: PCI bridge to [bus 6a-84]
[381019.313762] pci 0000:69:00.0:   bridge window [mem 0xab400000-0xb7cfffff]
[381019.321452] pci 0000:69:00.0:   bridge window [mem 0x1ffb3f000000-0x1ffb7effffff 64bit pref]
[381019.330982] pci 0000:68:00.0: PCI bridge to [bus 69-84]
[381019.336923] pci 0000:68:00.0:   bridge window [mem 0xab400000-0xb7cfffff]
[381019.344610] pci 0000:68:00.0:   bridge window [mem 0x1ffb3f000000-0x1ffb7effffff 64bit pref]
[381019.354139] pcieport 0000:4a:03.2: PCI bridge to [bus 68-84]
[381019.360568] pcieport 0000:4a:03.2:   bridge window [mem 0xab400000-0xb7cfffff]
[381019.368738] pcieport 0000:4a:03.2:   bridge window [mem 0x1ffb3f000000-0x1ffb7effffff 64bit pref]
[381019.378754] PCI: No. 2 try to assign unassigned res
[381019.378759] pci 0000:69:00.0: resource 14 [mem 0xab400000-0xb7cfffff] released
[381019.386935] pci 0000:69:00.0: PCI bridge to [bus 6a-84]
[381019.392879] pci 0000:68:00.0: resource 14 [mem 0xab400000-0xb7cfffff] released
[381019.401053] pci 0000:68:00.0: PCI bridge to [bus 69-84]
[381019.406994] pcieport 0000:4a:03.2: resource 14 [mem 0xab400000-0xb7cfffff] released
[381019.415649] pcieport 0000:4a:03.2: PCI bridge to [bus 68-84]
[381019.422065] pci 0000:69:00.0: bridge window [io  0x1000-0x0fff] to [bus 6a-84] add_size 1000
[381019.431589] pci 0000:68:00.0: bridge window [io  0x1000-0x0fff] to [bus 69-84] add_size 1000
[381019.441115] pcieport 0000:4a:03.2: bridge window [io  0x1000-0x0fff] to [bus 68-84] add_size 2000
[381019.451125] pci 0000:68:00.0: bridge window [mem 0x00100000-0x0c9fffff] extended by 0x0000000000100000
[381019.451127] pci 0000:69:00.0: bridge window [mem 0x00100000-0x0c9fffff] extended by 0x0000000000100000
[381019.451135] pcieport 0000:4a:03.2: BAR 14: no space for [mem size 0x0ca00000]
[381019.459197] pcieport 0000:4a:03.2: BAR 14: failed to assign [mem size 0x0ca00000]
[381019.467653] pcieport 0000:4a:03.2: BAR 13: no space for [io  size 0x2000]
[381019.475337] pcieport 0000:4a:03.2: BAR 13: failed to assign [io  size 0x2000]
[381019.483407] pcieport 0000:4a:03.2: BAR 14: no space for [mem size 0x0ca00000]
[381019.491478] pcieport 0000:4a:03.2: BAR 14: failed to assign [mem size 0x0ca00000]
[381019.499942] pcieport 0000:4a:03.2: BAR 13: no space for [io  size 0x2000]
[381019.507628] pcieport 0000:4a:03.2: BAR 13: failed to assign [io  size 0x2000]
[381019.515715] pci 0000:68:00.0: BAR 14: no space for [mem size 0x0ca00000]
[381019.523301] pci 0000:68:00.0: BAR 14: failed to assign [mem size 0x0ca00000]
[381019.531280] pci 0000:68:00.0: BAR 0: no space for [mem size 0x00040000]
[381019.538772] pci 0000:68:00.0: BAR 0: failed to assign [mem size 0x00040000]
[381019.546656] pci 0000:68:00.0: BAR 13: no space for [io  size 0x1000]
[381019.553861] pci 0000:68:00.0: BAR 13: failed to assign [io  size 0x1000]
[381019.561453] pci 0000:68:00.0: BAR 14: no space for [mem size 0x0ca00000]
[381019.569041] pci 0000:68:00.0: BAR 14: failed to assign [mem size 0x0ca00000]
[381019.577017] pci 0000:68:00.0: BAR 0: no space for [mem size 0x00040000]
[381019.584508] pci 0000:68:00.0: BAR 0: failed to assign [mem size 0x00040000]
[381019.592386] pci 0000:68:00.0: BAR 13: no space for [io  size 0x1000]
[381019.599584] pci 0000:68:00.0: BAR 13: failed to assign [io  size 0x1000]
[381019.607173] pci 0000:69:00.0: BAR 14: no space for [mem size 0x0ca00000]
[381019.614758] pci 0000:69:00.0: BAR 14: failed to assign [mem size 0x0ca00000]
[381019.622730] pci 0000:69:00.0: BAR 13: no space for [io  size 0x1000]
[381019.629929] pci 0000:69:00.0: BAR 13: failed to assign [io  size 0x1000]
[381019.637516] pci 0000:69:00.0: BAR 14: no space for [mem size 0x0ca00000]
[381019.645104] pci 0000:69:00.0: BAR 14: failed to assign [mem size 0x0ca00000]
[381019.653080] pci 0000:69:00.0: BAR 13: no space for [io  size 0x1000]
[381019.660280] pci 0000:69:00.0: BAR 13: failed to assign [io  size 0x1000]
[381019.667869] pci 0000:69:00.0: PCI bridge to [bus 6a-84]
[381019.673815] pci 0000:69:00.0:   bridge window [mem 0x1ffb3f000000-0x1ffb7effffff 64bit pref]
[381019.683344] pci 0000:68:00.0: PCI bridge to [bus 69-84]
[381019.689287] pci 0000:68:00.0:   bridge window [mem 0x1ffb3f000000-0x1ffb7effffff 64bit pref]
[381019.698818] pcieport 0000:4a:03.2: PCI bridge to [bus 68-84]
[381019.705247] pcieport 0000:4a:03.2:   bridge window [mem 0x1ffb3f000000-0x1ffb7effffff 64bit pref]
[381019.715302] pcieport 0000:68:00.0: runtime IRQ mapping not provided by arch
[381019.715319] pcieport 0000:68:00.0: enabling device (0000 -> 0002)
[381019.722505] pcieport 0000:69:00.0: runtime IRQ mapping not provided by arch
[381019.722518] pcieport 0000:68:00.0: enabling bus mastering
[381019.722523] pcieport 0000:69:00.0: enabling device (0000 -> 0002)
[381019.729475] pcieport 0000:69:00.0: enabling bus mastering
[381019.729850] pcieport 0000:69:00.0: pciehp: Slot Capabilities      : 0x00000060
[381019.729853] pcieport 0000:69:00.0: pciehp: Slot Status            : 0x0000
[381019.729855] pcieport 0000:69:00.0: pciehp: Slot Control           : 0x0108
[381019.729860] pcieport 0000:69:00.0: pciehp: Slot #0 AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise+ Interlock- NoCompl- IbPresDis- LLActRep+
[381019.745024] pci_bus 0000:6a: dev 00, created physical slot 0
[381019.745168] pcieport 0000:69:00.0: pciehp: pcie_enable_notification: SLOTCTRL 80 write cmd 1038
[381019.745191] pcieport 0000:69:00.0: pciehp: pending interrupts 0x0010 from Slot Status
[381019.745208] pcieport 0000:69:00.0: pciehp: pciehp_check_link_active: lnk_status = 1
[381019.745325] pcieport 0000:4a:03.2: pciehp: pciehp_set_indicators: SLOTCTRL 70 write cmd 100
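
Note on reading the log above: the first pass assigns [mem 0xab400000-0xb7cfffff] to 4a:03.2, 68:00.0 and 69:00.0; the "No. 2 try to assign unassigned res" pass then releases that window and requests [mem size 0x0ca00000], which fails. A minimal Python sketch for pulling the failed assignments out of such a log and comparing the sizes, assuming the dmesg line format shown here (`failed_assignments` and `window_size` are hypothetical helper names). It only restates what the log shows and says nothing about the change made in the PR:

```python
import re

# Matches "<dddd:bb:dd.f>: BAR <n>: failed to assign [mem|io size 0x...]".
FAILED_RE = re.compile(
    r"(?P<bdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"BAR (?P<bar>\d+): failed to assign "
    r"\[(?P<kind>mem|io)\s+size (?P<size>0x[0-9a-f]+)\]"
)


def failed_assignments(log_text):
    """Return unique (device, BAR, kind, size) failures in first-seen order."""
    seen = []
    for m in FAILED_RE.finditer(log_text):
        entry = (m.group("bdf"), int(m.group("bar")),
                 m.group("kind"), int(m.group("size"), 16))
        if entry not in seen:
            seen.append(entry)
    return seen


def window_size(start, end):
    """Size in bytes of an inclusive [start, end] address range."""
    return end - start + 1


# Three lines copied from the log in this report.
SAMPLE = """\
[381019.459197] pcieport 0000:4a:03.2: BAR 14: failed to assign [mem size 0x0ca00000]
[381019.475337] pcieport 0000:4a:03.2: BAR 13: failed to assign [io  size 0x2000]
[381019.491478] pcieport 0000:4a:03.2: BAR 14: failed to assign [mem size 0x0ca00000]
"""

failures = failed_assignments(SAMPLE)
released = window_size(0xAB400000, 0xB7CFFFFF)  # window released in pass 2
requested = failures[0][3]                      # size requested in pass 2
print(hex(released), hex(requested), (requested - released) // (1 << 20))
# prints: 0xc900000 0xca00000 1
```

That is, the retry asks for 202M where the previously assigned window was 201M, consistent with the "extended by 0x0000000000100000" lines for 68:00.0 and 69:00.0.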
Comment 1 小龙 admin 2025-02-18 10:20:44 UTC
The PR Link: https://gitee.com/anolis/cloud-kernel/pulls/4634
Comment 2 chuguangqing inspur_group 2025-02-19 15:42:21 UTC
After the PR was merged, the resource can be allocated again (200M).
https://gitee.com/anolis/cloud-kernel/pulls/4634