Bug 4078 - possible deadlock in swiotlb_do_find_slots
Summary: possible deadlock in swiotlb_do_find_slots
Status: RESOLVED FIXED
Alias: None
Product: Anolis OS 8
Classification: Anolis OS
Component: kernel - anck-5.10
Version: ---
Hardware: All Linux
Importance: P3-Medium S3-normal
Target Milestone: ---
Assignee: yuguorui96
QA Contact: shuming
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-02-15 20:41 UTC by yuguorui96
Modified: 2023-02-28 19:38 UTC
CC List: 1 user

See Also:


Attachments

Description yuguorui96 alibaba_cloud_group 2023-02-15 20:41:15 UTC
Description of problem:
Confidential VMs have to use swiotlb for I/O operations. Because NVMe requires min_align_mask, the do/while loop in the following code can fail to exit.

static int swiotlb_do_find_slots(struct device *dev, int area_index,
		phys_addr_t orig_addr, size_t alloc_size,
		unsigned int alloc_align_mask)
{
...
	do {
		slot_index = slot_base + index;

		if (orig_addr &&
		    (slot_addr(tbl_dma_addr, slot_index) &
		     iotlb_align_mask) != (orig_addr & iotlb_align_mask)) {
			index = wrap_area_index(mem, index + 1);
			continue;
		}

		/*
		 * If we find a slot that indicates we have 'nslots' number of
		 * contiguous buffers, we allocate the buffers from that slot
		 * and mark the entries as '0' indicating unavailable.
		 */
		if (!iommu_is_span_boundary(slot_index, nslots,
					    nr_slots(tbl_dma_addr),
					    max_slots)) {
			if (mem->slots[slot_index].list >= nslots)
				goto found;
		}
		index = wrap_area_index(mem, index + stride);
	} while (index != wrap);

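In the normal case, when swiotlb has enough free slots, the index = wrap_area_index(mem, index + 1) logic is fine: the caller quickly takes a tlb_cache and releases the lock.
If it is unlucky and does not find one, however, it stays in this loop until it does; in other words, there is no other exit path.

The problem is that it holds the lock for as long as it stays in the loop, so nobody else can return slots, and the result is a deadlock.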
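
The behaviour described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code and not the merged patch. The area size, the stride, the "only odd slots are aligned" rule, the step cap and the index_nowrap counter are all hypothetical values and names chosen for the demonstration. wrap_area_index is modelled as a plain wrap to 0 without a modulo, and the pool is assumed to be exhausted so that no slot is ever found. Mixing steps of 1 and of stride then lets index jump over wrap on every pass, so the index != wrap test never ends the loop. The second function shows one possible way to bound the traversal by counting how far the scan has advanced.

```c
#include <stdbool.h>

#define AREA_NSLABS 16u   /* hypothetical number of slots in one area */
#define STRIDE       4u   /* hypothetical stride for an aligned request */
#define STEP_LIMIT 1000u  /* safety cap so this demo always returns */

/* Modelled helper: wrap to 0 once the end of the area is passed (no modulo). */
static unsigned int wrap_area_index(unsigned int index)
{
	return index >= AREA_NSLABS ? 0 : index;
}

/* Hypothetical alignment rule: only odd slots satisfy the device mask. */
static bool alignment_ok(unsigned int index)
{
	return (index & 1u) == 1u;
}

/*
 * Loop shaped like the excerpt, with an exhausted pool (nothing is ever
 * found). Returns the number of iterations, or STEP_LIMIT if the
 * index != wrap condition never became false.
 */
static unsigned int scan_until_wrap(unsigned int start)
{
	unsigned int index = start, wrap = start, steps = 0;

	do {
		if (++steps >= STEP_LIMIT)
			break;          /* demo-only escape hatch */
		if (!alignment_ok(index)) {
			index = wrap_area_index(index + 1);
			continue;
		}
		/* Slot is aligned but not free: skip ahead by the stride. */
		index = wrap_area_index(index + STRIDE);
	} while (index != wrap);

	return steps;
}

/*
 * Bounded variant: index_nowrap (hypothetical name) counts the distance
 * covered, so the loop ends after at most one pass over the area.
 */
static unsigned int scan_bounded(unsigned int start)
{
	unsigned int index = start, index_nowrap = 0, steps = 0;

	do {
		steps++;
		if (!alignment_ok(index)) {
			index = wrap_area_index(index + 1);
			index_nowrap += 1;
			continue;
		}
		index = wrap_area_index(index + STRIDE);
		index_nowrap += STRIDE;
	} while (index_nowrap < AREA_NSLABS);

	return steps;
}
```

With a starting index of 4, scan_until_wrap(4) only stops because of the demo cap and returns STEP_LIMIT: the scan visits 4, 5, 9, 13, 0, 1, 5, ... and never lands on 4 again. scan_bounded(4) returns after a handful of iterations, at which point a real caller could drop the lock and report that no slot is available.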

Version-Release number of selected component (if applicable):


How reproducible: In a confidential computing environment (TDX/SEV), run an I/O stress test, for example:
mkfs.xfs /dev/nvme1n1

Actual results:
deadlock

Expected results:

Additional info:
Backtraces:
[10199.924391] RIP: 0010:swiotlb_do_find_slots+0x1fe/0x3e0
[10199.924403] Call Trace:
[10199.924404]
[10199.924405] swiotlb_tbl_map_single+0xec/0x1f0
[10199.924407] swiotlb_map+0x5c/0x260
[10199.924409] ? nvme_pci_setup_prps+0x1ed/0x340
[10199.924411] dma_direct_map_page+0x12e/0x1c0
[10199.924413] nvme_map_data+0x304/0x370
[10199.924415] nvme_prep_rq.part.0+0x31/0x120
[10199.924417] nvme_queue_rq+0x77/0x1f0
[10199.924420] blk_mq_dispatch_rq_list+0x17e/0x670
[10199.924422] __blk_mq_sched_dispatch_requests+0x129/0x140
[10199.924424] blk_mq_sched_dispatch_requests+0x34/0x60
[10199.924426] __blk_mq_run_hw_queue+0x91/0xb0
[10199.924428] process_one_work+0x1df/0x3b0
[10199.924430] worker_thread+0x49/0x2e0
[10199.924432] ? rescuer_thread+0x390/0x390
[10199.924433] kthread+0xe5/0x110
[10199.924435] ? kthread_complete_and_exit+0x20/0x20
[10199.924436] ret_from_fork+0x1f/0x30
[10199.924439]
Comment 1 maqiao alibaba_cloud_group 2023-02-28 19:38:00 UTC
merged: https://gitee.com/anolis/cloud-kernel/pulls/1191