zhaoxin inclusion
category: feature

-------------------

On a platform with USB4 support, plug a TBT3 dock into the Type-C port,
then plug an ext4-formatted USB disk into the TBT3 dock. After the devices
are enumerated, put the system into hibernation. If the TBT3 dock is
unplugged while the system is asleep and the system is then woken up, the
system may randomly deadlock during the restore phase of hibernation, and
the restore never completes.

The deadlock arises as follows.

The TBT3 dock consists of a PCIe switch and a PCIe endpoint:

RP-- 00.0-+ [8086:15ef] Upstream Port
          +-02.0-+ [8086:15ef] Downstream Port
          |      +-00.0 [8086:15f0] Thunderbolt 3 USB Controller
          +-04.0 [8086:15ef] Downstream Port

During resume, the PCI driver detects that the switch under the RP has
been disconnected, so it starts hot-unplug processing, which removes the
entire PCIe hierarchy behind the RP. The removal path is:

  pciehp_unconfigure_device
  pci_stop_and_remove_bus_device
  pci_stop_dev
  ...
  xhci_pci_remove
  usb_remove_hcd
  usb_disconnect
  usb_disable_device
  usb_unbind_interface
  usb_stor_disconnect
  quiesce_and_remove_host
  scsi_remove_host
  scsi_forget_host
  __scsi_remove_device
  sd_remove
  del_gendisk
  invalidate_partition
  fsync_bdev
  sync_filesystem
  __sync_filesystem
  sb->s_op->sync_fs
  ext4_sync_fs
  blkdev_issue_flush
  submit_bio_wait
  submit_bio
  generic_make_request
  blk_queue_enter

The path finally gets stuck in blk_queue_enter() and never returns,
because the request queue is not marked dying and only PM requests are
allowed. Meanwhile, resuming the USB disk and the sd device also requires
device_lock, which is already held by the removal path. A deadlock
therefore occurs.

To fix this issue, when deleting a SCSI device, mark the device's request
queue as dying if the device is detected to have been surprise removed.
Also add a callback to the USB storage driver to identify surprise
removal.

Signed-off-by: leoliu-oc <leoliu-oc@zhaoxin.com>
The PR Link: https://gitee.com/anolis/cloud-kernel/pulls/5799
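
The blocking condition and the effect of the fix described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct model_queue`, `model_queue_enter()`, `model_remove_device()` and `MODEL_WOULD_BLOCK` are hypothetical names invented for illustration, not kernel APIs, and the logic is a heavy simplification of what blk_queue_enter() does. It is not the code in this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical marker: the real kernel would sleep here instead of returning. */
#define MODEL_WOULD_BLOCK 1

/* Hypothetical, simplified stand-in for a block layer request queue. */
struct model_queue {
	bool pm_only;	/* device suspended: only PM requests may enter */
	bool dying;	/* queue is being torn down */
};

/*
 * Simplified model of the decision made when a request enters the queue:
 *   - a dying queue rejects the request with -ENODEV;
 *   - a pm_only queue makes a non-PM request wait (modelled as a return
 *     value instead of an actual sleep);
 *   - otherwise the request is admitted.
 */
int model_queue_enter(const struct model_queue *q, bool is_pm_request)
{
	assert(q != NULL);

	if (q->dying)
		return -ENODEV;
	if (q->pm_only && !is_pm_request)
		return MODEL_WOULD_BLOCK;
	return 0;
}

/*
 * Model of the idea behind the fix: when a device is removed and it is
 * known to have been surprise removed, mark its queue as dying so that
 * later I/O fails fast instead of waiting.
 */
void model_remove_device(struct model_queue *q, bool surprise_removed)
{
	assert(q != NULL);

	if (surprise_removed)
		q->dying = true;
}
```

In this model, a flush issued against a queue with `pm_only` set and `dying` clear yields `MODEL_WOULD_BLOCK`, which corresponds to the hang in the removal path. After `model_remove_device(&q, true)`, the same call returns `-ENODEV`, so the removal path can proceed and release the lock.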