Bug 5890 - xfs: AGF buf deadlock between xfs_create and xfs_fs_destroy_inode
Summary: xfs: AGF buf deadlock between xfs_create and xfs_fs_destroy_inode
Status: RESOLVED FIXED
Alias: None
Product: ANCK 5.10 Dev
Classification: ANCK
Component: fs
Version: unspecified
Hardware: All Linux
Importance: P3-Medium S3-normal
Target Milestone: ---
Assignee: Joseph Qi
QA Contact: shuming
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-07-17 10:58 UTC by lisa
Modified: 2023-08-22 11:05 UTC

See Also:


Description lisa 2023-07-17 10:58:31 UTC
Description of problem:
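xfs_create() is handed an inode number by the allocator, but xfs_iget() keeps retrying because the corresponding VFS inode is still marked I_FREEING. During the allocation, removing the record from the finobt freed a btree block, so xfs_create() took the AGF buffer lock and still holds it while it loops.
Meanwhile, xfs_fs_destroy_inode() is freeing the same inode: the VFS has set I_FREEING, xfs_difree() has already put the inode back into the finobt, and the task is now waiting in xfs_trans_commit() for the AGF buffer lock held by xfs_create().
Each task waits on something the other holds, hence the deadlock.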
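
The circular wait described above can be modeled as a tiny wait-for graph. This is only an illustrative sketch, not kernel code: the task indices, `NO_WAIT`, `has_deadlock()` and `reported_case_deadlocks()` are hypothetical names introduced here, and the model assumes each task blocks on at most one other task.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two tasks in this report (not kernel code). */
enum { TASK_CREATE = 0, TASK_DESTROY = 1, NO_WAIT = -1 };

/*
 * waits_for[i] is the index of the task that task i is blocked on,
 * or NO_WAIT if task i can make progress.
 * Returns true if following the chain from any task leads back to it.
 */
static bool has_deadlock(const int *waits_for, size_t ntasks)
{
    for (size_t start = 0; start < ntasks; start++) {
        int cur = waits_for[start];
        /* A cycle through 'start' is found within at most ntasks steps. */
        for (size_t steps = 0; steps < ntasks && cur != NO_WAIT; steps++) {
            assert(cur >= 0 && (size_t)cur < ntasks);
            if ((size_t)cur == start)
                return true;
            cur = waits_for[cur];
        }
    }
    return false;
}

/* The situation in this report: each side holds what the other needs. */
static bool reported_case_deadlocks(void)
{
    int waits_for[2];

    /* xfs_create holds the AGF buffer and retries xfs_iget while I_FREEING is set. */
    waits_for[TASK_CREATE] = TASK_DESTROY;
    /* xfs_fs_destroy_inode cannot finish freeing the inode until it gets the AGF buffer. */
    waits_for[TASK_DESTROY] = TASK_CREATE;

    return has_deadlock(waits_for, 2);
}
```

In this model `reported_case_deadlocks()` returns true, while a graph in which the create side is not blocked on the destroy side has no cycle.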

xfs_create looping:
PID: 1894063 TASK: ffff954f494dc500 CPU: 5 COMMAND: postgres*
#0 [ffffa141ca34f920] __schedule at ffffffff9ca58505
#1 [ffffa141ca34f9b0] schedule at ffffffff9ca5899c
#2 [ffffa141ca34f9c0] schedule_timeout at ffffffff9ca5c027
#3 [ffffa141ca34fa48] xfs_iget at ffffffffc1137b4f [xfs]	xfs_iget_cache_hit -> igrab(inode)
#4 [ffffa141ca34fb00] xfs_ialloc at ffffffffc1140ab5 [xfs]
#5 [ffffa141ca34fb80] xfs_dir_ialloc at ffffffffc1142bfc [xfs]
#6 [ffffa141ca34f10] xfs_create at ffffffffc1142fc8 [xfs]
#7 [ffffa141ca34fca0] xfs_generic_create at ffffffffc1140229 [xfs]
...

crash> inode.i_state ffff954f76496130 -x     
  i_state = 0x60
                  #define I_FREEING       (1 << 5)  
                  #define I_CLEAR         (1 << 6)
crash> xfs_inode.i_flags ffff954f76496000 -x
  i_flags = 0x40
                  #define XFS_IDIRTY_RELEASE  (1 << 6)
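
As a quick check of the `i_state = 0x60` value shown above, the following sketch decodes it using only the two bit definitions quoted in this report. `decode_i_state()` is a hypothetical helper written for illustration; it is not a kernel or crash utility function, and it ignores all other `i_state` bits.

```c
#include <stdio.h>
#include <stddef.h>

/* Bit values as quoted in this report; other i_state bits are not modeled. */
#define I_FREEING (1UL << 5)
#define I_CLEAR   (1UL << 6)

/* Writes the known flag names, joined by '|', into buf and returns buf. */
static const char *decode_i_state(unsigned long state, char *buf, size_t len)
{
    int n = 0;

    if (len == 0)
        return buf;
    buf[0] = '\0';

    if (state & I_FREEING)
        n += snprintf(buf + n, len - n, "I_FREEING");
    /* Only append if the previous write did not already fill the buffer. */
    if ((state & I_CLEAR) && (size_t)n < len)
        n += snprintf(buf + n, len - n, "%sI_CLEAR", n ? "|" : "");

    return buf;
}
```

With a sufficiently large buffer, decoding 0x60 gives both names, which matches the report's reading that the VFS inode is being torn down while xfs_iget() keeps retrying.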


xfs_fs_destroy_inode waiting for the AGF lock:
PID: 202276 TASK: ffff954d142/0000 CPU: 2 COMMAND: postgres*
#0 [ffffa141c12638d0] __schedule at ffffffff9ca58505
#1 [ffffa141c1263960] schedule at ffffffff9ca5899c
#2 [ffffa141c1263970] schedule_timeout at ffffffff9ca5c0a9
#3 [ffffa141c1263988] __down at ffffffff9ca5aba5
#4 [ffffa141c1263a58] down at ffffffff9c146d6b
#5 [ffffa141c1263a70] xfs_buf_lock at ffffffffc112c3dc [xfs]
#6 [ffffa141c1263a80] xfs_buf_find at ffffffffc112c83d [xfs]
#7 [ffffa141c1263b18] xfs_buf_get_map at ffffffffc112cb3c [xfs]
#8 [ffffa141c1263b70] xfs_buf_read_map at ffffffffc112d175 [xfs]
#9 [ffffa141c1263bc8] xfs_trans_read_buf_map at ffffffffc116404a [xfs]
#10 [ffffa141c1263c28] xfs_read_agf at ffffffffc10e1c44 [xfs]
#11 [ffffa141c1263c80] xfs_alloc_read_agf at ffffffffc10e1d0a [xfs]
#12 [ffffa141c1263cb0] xfs_agfl_free_finish_item at ffffffffc115a45a [xfs]
#13 [ffffa141c1263d00] xfs_defer_finish_noroll at ffffffffc110257e [xfs]
#14 [ffffa141c1263d68] xfs_trans_commit at ffffffffc1150581 [xfs]
#15 [ffffa141c1263da8] xfs_inactive_free at ffffffffc1144084 [xfs]
#16 [ffffa141c1263dd8] xfs_inactive at ffffffffc11441f2 [xfs]
#17 [ffffa141c1263df0] xfs_fs_destroy_inode at ffffffffc114d489 [xfs]
#18 [ffffa141c1263e10] destroy_inode at ffffffff9c3838a8
#19 [ffffa141c1263e28] dentry_kill at ffffffff9c37f5d5
#20 [ffffa141c1263e48] dput at ffffffff9c3800ab
#21 [ffffa141c1263e70] do_renameat2 at ffffffff9c376a8b
#22 [ffffa141c1263f38] sys_rename at ffffffff9c376cdc
#23 [ffffa141c1263f40] do_syscall_64 at ffffffff9ca4a4c0
#24 [ffffa141c1263f50] entry_SYSCALL_64_after_hwframe at ffffffff9cc00099

Call trace of xfs_create taking the AGF lock:
0xffffffffc039bc57 : xfs_read_agf+0x97/0x110 [xfs]
 0xffffffffc039bd0a : xfs_alloc_read_agf+0x3a/0x180 [xfs]
 0xffffffffc039c231 : xfs_alloc_fix_freelist+0x3e1/0x460 [xfs]
 0xffffffffc039c7e4 : xfs_free_extent_fix_freelist+0x64/0xb0 [xfs]
 0xffffffffc039c888 : __xfs_free_extent+0x58/0x180 [xfs]
 0xffffffffc03b01de : xfs_btree_free_block+0x1e/0xb0 [xfs]
 0xffffffffc03b02fb : xfs_btree_kill_root+0x8b/0xb0 [xfs]
 0xffffffffc03b4f73 : xfs_btree_delrec+0x923/0xe60 [xfs]
 0xffffffffc03b6263 : xfs_btree_delete+0x43/0x110 [xfs]
 0xffffffffc03cbeb3 : xfs_dialloc_ag+0x143/0x260 [xfs]
 0xffffffffc03ccce1 : xfs_dialloc+0x61/0x2b0 [xfs]
 0xffffffffc03faa23 : xfs_ialloc+0x83/0x530 [xfs]
 0xffffffffc03fcbfc : xfs_dir_ialloc+0x6c/0x220 [xfs]
 0xffffffffc03fcfc8 : xfs_create+0x218/0x570 [xfs]

How reproducible:
It is difficult to reproduce because two conditions must be met at the same time: 1. The inode number allocated by xfs_create() corresponds to a deleted file; 2. That same allocation changes the finobt's blocks (here, a finobt block is freed).

Comment 1 小龙 admin 2023-07-17 15:12:33 UTC
The PR Link: https://gitee.com/anolis/cloud-kernel/pulls/1905
Comment 2 小龙 admin 2023-08-03 14:33:36 UTC
The PR Link: https://gitee.com/anolis/cloud-kernel/pulls/1990
Comment 3 Joseph Qi alibaba_cloud_group 2023-08-22 11:05:46 UTC
(In reply to 小龙 from comment #2)
> The PR Link: https://gitee.com/anolis/cloud-kernel/pulls/1990

merged