Bug 8740 - pcie按键热拔时,电源指示灯灭立即拔盘导致卸载失败;若继续热插,则无法识别插入的磁盘
Summary: pcie按键热拔时,电源指示灯灭立即拔盘导致卸载失败;若继续热插,则无法识别插入的磁盘
Status: NEW
Alias: None
Product: ANCK 5.10 Dev
Classification: ANCK
Component: drivers (show other bugs) drivers
Version: 5.10.y-19
Hardware: All Linux
: P3-Medium S3-normal
Target Milestone: ---
Assignee: GuixinLiu
QA Contact: shuming
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-04-11 19:40 UTC by fangyumeng
Modified: 2024-06-12 10:20 UTC (History)
2 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description fangyumeng 2024-04-11 19:40:21 UTC
1. Description of problem:
(1)该问题发生在按键热拔事件中,由于remove_board函数中先执行pciehp_set_indicators设置电源指示灯熄灭,再执行pciehp_unconfigure_device进行卸载。考虑到某些外设卸载时间较长(需要5-9秒),因此电源指示灯熄灭后用户立即拔盘,就会因为OS内卸载流程未完成导致如下问题:
(a)按键热拔时,按键后,电源指示灯闪烁5秒熄灭,此时立即拔盘。系统日志会报错“PCIe Bus Error”,导致OS内卸载进程阻塞在i40e_remove函数,这时候卸载进程还未释放pci互斥锁(mutex)。与此同时,OS内会调用pcie_do_recovery函数尝试恢复,但卸载进程还未释放互斥锁,导致pcie_do_recovery流程中获取互斥锁失败,因此pcie_do_recovery进程也阻塞。
(b)如果继续执行热插操作,同样会因为获取pci互斥锁失败,使插盘初始化进程阻塞在pci_lock_rescan_remove函数,最终导致系统无法加载识别插入的磁盘。

(2)外设卸载耗时的测试数据:
[root@host vtu]# time echo 0 > power
[ 3697.552681] pci 0001:01:00.3: Removing from iommu group 39
[ 3699.780722] pci 0001:01:00.2: Removing from iommu group 37
[ 3701.496712] pci 0001:01:00.1: Removing from iommu group 34
[ 3701.508163] pci 0001:01:00.0: Removed PTP clock
[ 3703.300751] pci 0001:01:00.0: Removing from iommu group 29

real   0m9.044s
user   0m0.000s
sys    0m0.000s

(3)pcie热插拔函数调用流程分析:
a. 先按键热拔操作
pciehp_ist //pcie中断处理函数
	-->if (events & PCI_EXP_SLTSTA_ABP)
		-->pciehp_handle_button_press 
	-->if (events & DISABLE_SLOT)
		-->pciehp_handle_disable_request
			-->pciehp_disable_slot
				--> __pciehp_disable_slot
					-->remove_board
						-->pciehp_set_indicators //先设置电源指示灯熄灭
						-->pciehp_unconfigure_device //然后执行卸载
							-->pci_lock_rescan_remove
								...    ------------> 拔盘导致pcie bus error, 此时还没有释放pci互斥锁
							-->pci_unlock_rescan_remove


b. 然后热插操作
pciehp_ist
	-->if (events & (PCI_EXP_SLTSTA_PDC | PCI_EXP_SLTSTA_DLLSC))
		-->pciehp_handle_presence_or_link_change
			-->link_active = pciehp_check_link_active(ctrl)
			-->case OFF_STATE
				-->ctrl_info(ctrl, "Slot(%s): Card present\n",slot_name(ctrl));
				-->pciehp_enable_slot
					--> __pciehp_enable_slot
						-->board_added
							-->pciehp_set_indicators //先设置电源指示灯常亮 
							-->pciehp_configure_device //然后设备初始化
								-->pci_lock_rescan_remove  -----------> 获取pci互斥锁失败
									...                    
								-->pci_unlock_rescan_remove



2. Version-Release number of selected component (if applicable):

3. How reproducible:
该问题必现

4.Steps to Reproduce:
(1)执行按键热拔操作,先按键,电源指示灯熄灭则立即拔盘;
(2)执行热插操作.

5. Actual results:
(1)按键热拔时,按键后,电源指示灯闪烁5秒熄灭,此时立即拔盘。系统日志会报错“PCIe Bus Error”,导致OS内卸载进程阻塞在i40e_remove函数,这时候卸载进程还未释放pci互斥锁(mutex)。与此同时,OS内会调用pcie_do_recovery函数尝试恢复设备,但卸载进程还未释放互斥锁,导致pcie_do_recovery流程中获取互斥锁失败,因此pcie_do_recovery进程也阻塞。

按键拔盘后系统日志报错“PCIe Bus Error”:
Dec 11 05:40:55 host kernel: [  525.057572] pcieport 000d:80:00.0: pciehp: Slot(3): Attention button pressed
Dec 11 05:40:55 host kernel: [  525.064640] pcieport 000d:80:00.0: pciehp: Slot(3): Powering off due to button press
Dec 11 05:40:59 host systemd[1]: Started Runs mimosa scheduler timed cpu-rt-runtime-task1.
Dec 11 05:40:59 host systemd[1]: cpu-rt-runtime-task1.service: Succeeded.
Dec 11 05:41:00 host kernel: [  530.080038] i40e 000d:81:00.3: removed PHC from enP13s3f3.
Dec 11 05:41:02 host kernel: [  531.602363] pcieport 000d:80:00.0: AER: Multiple Corrected error received: 000d:80:00.0
Dec 11 05:41:02 host kernel: [  531.653349] pcieport 000d:80:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
Dec 11 05:41:02 host kernel: [  531.662911] pcieport 000d:80:00.0:   device [1cf2:8301] error status/mask=00000001/00002000
Dec 11 05:41:02 host kernel: [  531.671263] pcieport 000d:80:00.0:    [ 0] RxErr                  (First)
Dec 11 05:41:02 host kernel: [  531.678066] pcieport 000d:80:00.0: DPC: containment event, status:0x0001 source:0x0000
Dec 11 05:41:02 host kernel: [  531.685986] pcieport 000d:80:00.0: DPC: unmasked uncorrectable error detected
Dec 11 05:41:02 host kernel: [  531.693122] pcieport 000d:80:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
Dec 11 05:41:02 host kernel: [  531.704251] pcieport 000d:80:00.0:   device [1cf2:8301] error status/mask=00004000/00108000
Dec 11 05:41:02 host kernel: [  531.712609] pcieport 000d:80:00.0:    [14] CmpltTO                (First)
Dec 11 05:41:02 host kernel: [  531.719413] i40e 000d:81:00.0: i40e_pci_error_detected: error 2
Dec 11 05:41:02 host kernel: [  531.744183] i40e 000d:81:00.0: VSI seid 400 Tx ring 383 disable timeout

卸载进程和pcie_do_recovery进程的阻塞堆栈信息如下:
[root@host vtu]# ps aux | grep 'D  '
root        1765  0.0  0.0      0     0 ?        D    05:32   0:00 [irq/310-pciehp]
root        1768  0.2  0.0      0     0 ?        D    05:32   0:01 [irq/311-pciehp]
root        1769  0.0  0.0      0     0 ?        D    05:32   0:00 [irq/311-pcie-dp]
[root@host vtu]# cat /proc/1768/stack
[<0>] __switch_to+0x108/0x168
[<0>] i40e_remove+0x7c/0x420 [i40e]
[<0>] pci_device_remove+0x44/0xe8
[<0>] device_release_driver_internal+0x110/0x1f0
[<0>] device_release_driver+0x1c/0x28
[<0>] pci_stop_bus_device+0x84/0xd8
[<0>] pci_stop_and_remove_bus_device+0x1c/0x30
[<0>] pciehp_unconfigure_device+0x90/0x158
[<0>] pciehp_disable_slot+0x84/0x150
[<0>] pciehp_handle_disable_request+0x50/0x70
[<0>] pciehp_ist+0x210/0x260
[<0>] irq_thread_fn+0x30/0xa0
[<0>] irq_thread+0x178/0x308
[<0>] kthread+0x114/0x118
[root@host vtu]# cat /proc/1769/stack
[<0>] __switch_to+0x108/0x168
[<0>] report_error_detected+0x30/0x120
[<0>] report_frozen_detected+0x1c/0x28
[<0>] pci_walk_bus+0x68/0xc0
[<0>] pci_walk_bridge+0x20/0x40
[<0>] pcie_do_recovery+0x168/0x290
[<0>] dpc_handler+0x34/0x48
[<0>] irq_thread_fn+0x30/0xa0
[<0>] irq_thread+0x178/0x308
[<0>] kthread+0x114/0x118

Dec 11 05:52:58 host kernel: [ 1247.647752] INFO: task irq/311-pcie-dp:1769 blocked for more than 606 seconds.
Dec 11 05:52:58 host kernel: [ 1247.654975]       Tainted: GF          OE     5.10.134-14.zncgsl6.aarch64 #1
Dec 11 05:52:58 host kernel: [ 1247.662020] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 11 05:52:58 host kernel: [ 1247.669842] task:irq/311-pcie-dp state:D stack:    0 pid: 1769 ppid:     2 flags:0x00000208
Dec 11 05:52:58 host kernel: [ 1247.678185] Call trace:
Dec 11 05:52:58 host kernel: [ 1247.680632]  __switch_to+0x108/0x168
Dec 11 05:52:58 host kernel: [ 1247.684203]  __schedule+0x2f0/0x7f8
Dec 11 05:52:58 host kernel: [ 1247.687685]  schedule+0x48/0xd0
Dec 11 05:52:58 host kernel: [ 1247.690816]  schedule_preempt_disabled+0x14/0x20
Dec 11 05:52:58 host kernel: [ 1247.695424]  __mutex_lock.isra.9+0x260/0x5a0
Dec 11 05:52:58 host kernel: [ 1247.699685]  __mutex_lock_slowpath+0x18/0x20
Dec 11 05:52:58 host kernel: [ 1247.703946]  mutex_lock+0x48/0x60
Dec 11 05:52:58 host kernel: [ 1247.707251]  report_error_detected+0x30/0x120
Dec 11 05:52:58 host kernel: [ 1247.711598]  report_frozen_detected+0x1c/0x28
Dec 11 05:52:58 host kernel: [ 1247.715949]  pci_walk_bus+0x68/0xc0
Dec 11 05:52:58 host kernel: [ 1247.719425]  pci_walk_bridge+0x20/0x40
Dec 11 05:52:58 host kernel: [ 1247.723165]  pcie_do_recovery+0x168/0x290
Dec 11 05:52:58 host kernel: [ 1247.727168]  dpc_handler+0x34/0x48
Dec 11 05:52:58 host kernel: [ 1247.730564]  irq_thread_fn+0x30/0xa0
Dec 11 05:52:58 host kernel: [ 1247.734133]  irq_thread+0x178/0x308
Dec 11 05:52:58 host kernel: [ 1247.737615]  kthread+0x114/0x118

(2)继续执行按键热插操作,同样会因为获取pci互斥锁失败,使插盘初始化进程阻塞在pci_lock_rescan_remove函数,最终导致系统无法加载识别插入的磁盘。系统日志中堆栈信息如下:

Dec 11 06:03:04 host kernel: [ 1853.855730] INFO: task irq/310-pciehp:1765 blocked for more than 606 seconds.
Dec 11 06:03:04 host kernel: [ 1853.862868]       Tainted: GF          OE     5.10.134-14.zncgsl6.aarch64 #1
Dec 11 06:03:04 host kernel: [ 1853.869910] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 11 06:03:04 host kernel: [ 1853.877731] task:irq/310-pciehp  state:D stack:    0 pid: 1765 ppid:     2 flags:0x00000208
Dec 11 06:03:04 host kernel: [ 1853.886075] Call trace:
Dec 11 06:03:04 host kernel: [ 1853.888521]  __switch_to+0x108/0x168
Dec 11 06:03:04 host kernel: [ 1853.892090]  __schedule+0x2f0/0x7f8
Dec 11 06:03:04 host kernel: [ 1853.895572]  schedule+0x48/0xd0
Dec 11 06:03:04 host kernel: [ 1853.898700]  schedule_preempt_disabled+0x14/0x20
Dec 11 06:03:04 host kernel: [ 1853.903310]  __mutex_lock.isra.9+0x260/0x5a0
Dec 11 06:03:04 host kernel: [ 1853.907575]  __mutex_lock_slowpath+0x18/0x20
Dec 11 06:03:04 host kernel: [ 1853.911838]  mutex_lock+0x48/0x60
Dec 11 06:03:04 host kernel: [ 1853.915142]  pci_lock_rescan_remove+0x20/0x28
Dec 11 06:03:04 host kernel: [ 1853.919493]  pciehp_configure_device+0x2c/0x140
Dec 11 06:03:04 host kernel: [ 1853.924017]  pciehp_handle_presence_or_link_change+0x200/0x3e8
Dec 11 06:03:04 host kernel: [ 1853.929842]  pciehp_ist+0x220/0x260
Dec 11 06:03:04 host kernel: [ 1853.933327]  irq_thread_fn+0x30/0xa0
Dec 11 06:03:04 host kernel: [ 1853.936897]  irq_thread+0x178/0x308
Dec 11 06:03:04 host kernel: [ 1853.940379]  kthread+0x114/0x118
Dec 11 06:03:04 host kernel: [ 1853.943600] INFO: task irq/311-pcie-dp:1769 blocked for more than 1212 seconds.
Dec 11 06:03:04 host kernel: [ 1853.950902]       Tainted: GF          OE     5.10.134-14.zncgsl6.aarch64 #1
Dec 11 06:03:04 host kernel: [ 1853.957942] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 11 06:03:04 host kernel: [ 1853.965761] task:irq/311-pcie-dp state:D stack:    0 pid: 1769 ppid:     2 flags:0x00000208
Dec 11 06:03:04 host kernel: [ 1853.974103] Call trace:
Dec 11 06:03:04 host kernel: [ 1853.976542]  __switch_to+0x108/0x168
Dec 11 06:03:04 host kernel: [ 1853.980109]  __schedule+0x2f0/0x7f8
Dec 11 06:03:04 host kernel: [ 1853.983591]  schedule+0x48/0xd0
Dec 11 06:03:04 host kernel: [ 1853.986720]  schedule_preempt_disabled+0x14/0x20
Dec 11 06:03:04 host kernel: [ 1853.991328]  __mutex_lock.isra.9+0x260/0x5a0
Dec 11 06:03:04 host kernel: [ 1853.995588]  __mutex_lock_slowpath+0x18/0x20
Dec 11 06:03:04 host kernel: [ 1853.999851]  mutex_lock+0x48/0x60
Dec 11 06:03:04 host kernel: [ 1854.003157]  report_error_detected+0x30/0x120
Dec 11 06:03:04 host kernel: [ 1854.007506]  report_frozen_detected+0x1c/0x28
Dec 11 06:03:04 host kernel: [ 1854.011856]  pci_walk_bus+0x68/0xc0
Dec 11 06:03:04 host kernel: [ 1854.015331]  pci_walk_bridge+0x20/0x40
Dec 11 06:03:04 host kernel: [ 1854.019071]  pcie_do_recovery+0x168/0x290
Dec 11 06:03:04 host kernel: [ 1854.023074]  dpc_handler+0x34/0x48
Dec 11 06:03:04 host kernel: [ 1854.026467]  irq_thread_fn+0x30/0xa0
Dec 11 06:03:04 host kernel: [ 1854.030034]  irq_thread+0x178/0x308
Dec 11 06:03:04 host kernel: [ 1854.033516]  kthread+0x114/0x118


6. Expected results:
按键热插拔操作成功执行,能够成功卸载并加载磁盘。


7. Additional info:
  龙蜥社区对nvme热插拔流程做了优化,更改了remove_board函数中灭灯(pciehp_set_indicators)和卸载(pciehp_unconfigure_device)的顺序,补丁链接为:
 https://gitee.com/anolis/cloud-kernel/commit/c58940f9cdb18b2ab386ef953f73b0345b2c3845
  回退上述补丁后,pcie按键热插拔测试验证通过。
Comment 1 小龙 admin 2024-04-25 19:59:16 UTC
The PR Link: https://gitee.com/anolis/cloud-kernel/pulls/3101
Comment 2 小龙 admin 2024-04-28 13:59:44 UTC
The PR Link: https://gitee.com/anolis/cloud-kernel/pulls/3103
Comment 3 小龙 admin 2024-05-31 15:18:20 UTC
The PR Link: https://gitee.com/anolis/cloud-kernel/pulls/3302
Comment 4 小龙 admin 2024-05-31 16:48:43 UTC
The PR Link: https://gitee.com/anolis/cloud-kernel/pulls/3303
Comment 5 小龙 admin 2024-06-12 10:20:46 UTC
The PR Link: https://gitee.com/anolis/cloud-kernel/pulls/3339