Bug 11608 - [ANCK 5.10] NFSD: Force all NFSv4.2 COPY requests to be synchronous
Summary: [ANCK 5.10] NFSD: Force all NFSv4.2 COPY requests to be synchronous
Status: NEW
Alias: None
Product: ANCK 5.10 Dev
Classification: ANCK
Component: fs
Version: 5.10.y-14
Hardware: All Linux
Importance: P3-Medium S3-normal
Target Milestone: ---
Assignee: zhaoqiang11
QA Contact: shuming
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-11-04 09:33 UTC by zhaoqiang11
Modified: 2024-11-13 15:45 UTC

See Also:


Description zhaoqiang11 2024-11-04 09:33:37 UTC
Comment 1 zhaoqiang11 2024-11-04 11:38:46 UTC
Description of problem:
When an NFS client mounts with NFSv4.2 and an application copies a large file with the copy_file_range system call, the client takes 
the asynchronous path in the nfs4_copy_file_range function. After receiving the client's RPC request, the NFS server creates an asynchronous 
kernel thread, nfsd4_do_async_copy, to perform the copy. If the NFS service is being stopped at that moment, the last NFSD kernel thread can set 
KTHREAD_SHOULD_STOP on the asynchronous copy thread before that thread has started running. The nfsd4_do_async_copy thread then fails to perform 
its final reference count decrement, and the nfsd4_copy->refcount reference count leaks.
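
The race described above can be modelled in user space. The following is only a minimal sketch, not kernel code: the names model_copy, model_async_copy and model_kthread_start are hypothetical, and it assumes the copy starts with a single reference that the copy thread is expected to drop.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct nfsd4_copy: only the reference count. */
struct model_copy {
	int refcount;
};

/*
 * Hypothetical stand-in for the copy thread body. Assumption: the thread
 * drops the reference that was set up for it when the copy was created.
 */
static int model_async_copy(struct model_copy *copy)
{
	assert(copy->refcount > 0);
	copy->refcount--;
	return 0;
}

/*
 * Models the start-up check in kthread(): the thread function is only
 * called if the stop flag was not set before the thread first ran.
 * Returns the reference count that is left behind.
 */
static int model_kthread_start(bool stop_requested_early)
{
	struct model_copy copy = { .refcount = 1 };

	if (!stop_requested_early)
		model_async_copy(&copy);

	return copy.refcount;
}
```

In this model, model_kthread_start(false) leaves no reference behind, while model_kthread_start(true) leaves one, which corresponds to the leaked nfsd4_copy->refcount.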

How reproducible:
1. In the kthread function, add a 10-second delay to simulate a delayed wake-up of the asynchronous copy thread.

diff --git a/kernel/kthread.c b/kernel/kthread.c
index 93755d413..bcbabd758 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -29,6 +29,7 @@
 #include <linux/numa.h>
 #include <linux/sched/isolation.h>
 #include <trace/events/sched.h>
+#include <linux/delay.h>
 
 
 static DEFINE_SPINLOCK(kthread_create_lock);
@@ -307,6 +308,13 @@ static int kthread(void *_create)
        schedule_preempt_disabled();
        preempt_enable();
 
+       struct task_struct *current_task = current;
+       const char *current_comm = current_task->comm;
+
+       if (strstr(current_comm, "copy thread") != NULL) {
+               msleep(10000);
+       }
+
        ret = -EINTR;
        if (!test_bit(KTHREAD_SHOULD_STOP, &self->flags)) {
                cgroup_kthread_ready();


2. On a single NFS client, use the copy_file_range system call to copy files between NFS exports.

In this test, the client mounts two different export directories from the same NFS server and uses copy_file_range 
to copy files from one export to the other. While the copies run, the NFS service on the server is restarted periodically.
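
A client-side copy helper for this step might look like the sketch below. It is an illustration only, not the reporter's test program: the function name copy_whole_file is hypothetical, and the call goes through syscall(2) so that the sketch also builds against C libraries without a copy_file_range() wrapper.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Copy all of src to dst with the copy_file_range system call.
 * Returns the number of bytes copied, or -1 on error (errno is set).
 */
static long long copy_whole_file(const char *src, const char *dst)
{
	struct stat st;
	long long total = 0;
	int in, out, saved_errno;

	in = open(src, O_RDONLY);
	if (in < 0)
		return -1;
	if (fstat(in, &st) < 0) {
		saved_errno = errno;
		close(in);
		errno = saved_errno;
		return -1;
	}
	out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (out < 0) {
		saved_errno = errno;
		close(in);
		errno = saved_errno;
		return -1;
	}

	while (total < (long long)st.st_size) {
		size_t remaining = (size_t)(st.st_size - total);
		/* NULL offsets: the kernel uses and advances both file positions. */
		long n = syscall(SYS_copy_file_range, in, (void *)NULL,
				 out, (void *)NULL, remaining, 0u);

		if (n < 0) {
			total = -1;
			break;
		}
		if (n == 0)	/* source ended earlier than fstat() reported */
			break;
		total += n;
	}

	saved_errno = errno;
	close(in);
	close(out);
	errno = saved_errno;
	return total;
}
```

To exercise the server, such a helper could be called in a loop with a source file under /home/nfs-client-42-test1 and a destination under /home/nfs-client-42-test2 (the file names are up to the tester) while the NFS service is restarted on the server.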

NFS server configuration:
[root@nfs-server ~]# cat /etc/exports
/nfs_server_test1 192.168.122.0/24(rw,async,no_root_squash)
/nfs_server_test2 192.168.122.0/24(rw,async,no_root_squash)

[root@nfs-server ~]# df -Th
Filesystem           Type      Size  Used Avail Use% Mounted on
devtmpfs             devtmpfs  3.8G     0  3.8G   0% /dev
tmpfs                tmpfs     3.8G     0  3.8G   0% /dev/shm
tmpfs                tmpfs     3.8G  900K  3.8G   1% /run
tmpfs                tmpfs     3.8G     0  3.8G   0% /sys/fs/cgroup
/dev/mapper/ncl-root xfs        19G   13G  5.6G  70% /
tmpfs                tmpfs     3.8G  4.0K  3.8G   1% /tmp
/dev/vda1            xfs      1014M  328M  687M  33% /boot
tmpfs                tmpfs     778M     0  778M   0% /run/user/0
/dev/vdb             ext4      2.0G  3.9M  1.8G   1% /data
/dev/vdc1            xfs        50G  398M   50G   1% /nfs_server_test1
/dev/vdc2            xfs        50G  399M   50G   1% /nfs_server_test2

NFS client configuration:
[root@nfs-client ~]# df -Th
Filesystem                        Type      Size  Used Avail Use% Mounted on
devtmpfs                          devtmpfs  3.8G     0  3.8G   0% /dev
tmpfs                             tmpfs     3.9G     0  3.9G   0% /dev/shm
tmpfs                             tmpfs     3.9G  836K  3.9G   1% /run
tmpfs                             tmpfs     3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/mapper/ncl-root              xfs        18G   14G  4.1G  78% /
tmpfs                             tmpfs     3.9G  4.0K  3.9G   1% /tmp
/dev/vda1                         xfs      1014M  326M  689M  33% /boot
tmpfs                             tmpfs     779M     0  779M   0% /run/user/0
192.168.122.170:/nfs_server_test1 nfs4       50G  397M   50G   1% /home/nfs-client-42-test1
192.168.122.170:/nfs_server_test2 nfs4       50G  399M   50G   1% /home/nfs-client-42-test2


Actual results:
A soft lockup occurs in an NFSD thread on the NFS server. The crash tool shows that nfsd4_copy.refcount is 1 and never drops to 0, 
so the nfsd4_shutdown_copy function loops endlessly.

The complete function call chain is as follows:
kthread
   nfsd
      nfsd_destroy
          svc_shutdown_net //serv->sv_ops->svo_shutdown=nfsd_last_thread
              nfsd_last_thread
                  nfsd_shutdown_net
                      nfs4_state_shutdown_net
                          nfs4_state_destroy_net
                             destroy_client
                                __destroy_client
                                    nfsd4_shutdown_copy   //while ((copy = nfsd4_get_copy(clp)) != NULL)
                                        nfsd4_stop_copy
                                            nfs4_put_copy

void nfsd4_shutdown_copy(struct nfs4_client *clp)
{
	struct nfsd4_copy *copy;

	while ((copy = nfsd4_get_copy(clp)) != NULL) // This loop never terminates.
		nfsd4_stop_copy(copy);
}

// The crash tool reports an nfsd4_copy.refcount value of 1.
crash> nfsd4_copy.refcount ffff8f6e54cd3000
  refcount = {
    refs = {
      counter = 1
    }
  }
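
Why the loop cannot finish once the reference is leaked can be illustrated with a small user-space model. This is a simplified sketch under stated assumptions, not the kernel implementation: all sim_* names are hypothetical, the on_list field and the behaviour of sim_stop_copy (a copy thread that ran unlists the copy and drops its own reference when stopped) are assumptions made for illustration, and the loop is bounded by max_iter so that the sketch terminates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct nfsd4_copy. */
struct sim_copy {
	int refcount;	/* models nfsd4_copy.refcount */
	int on_list;	/* non-zero while the client still lists this copy */
};

/* Models nfsd4_get_copy(): take a reference if a copy is still listed. */
static struct sim_copy *sim_get_copy(struct sim_copy *copy)
{
	if (!copy->on_list)
		return NULL;
	copy->refcount++;
	return copy;
}

/* Models nfs4_put_copy(): drop one reference. */
static void sim_put_copy(struct sim_copy *copy)
{
	assert(copy->refcount > 0);
	copy->refcount--;
}

/*
 * Models nfsd4_stop_copy(). Assumption: if the copy thread ran, stopping
 * it lets the thread unlist the copy and drop the thread's own reference.
 */
static void sim_stop_copy(struct sim_copy *copy, int thread_ran)
{
	if (thread_ran && copy->on_list) {
		copy->on_list = 0;
		sim_put_copy(copy);
	}
	sim_put_copy(copy);	/* drop the reference taken by sim_get_copy() */
}

/*
 * Models the loop in nfsd4_shutdown_copy(), bounded by max_iter.
 * Returns the number of iterations used, or -1 if the loop was still
 * running when the bound was reached.
 */
static int sim_shutdown_copy(struct sim_copy *copy, int thread_ran, int max_iter)
{
	int i;

	for (i = 0; i < max_iter; i++) {
		struct sim_copy *c = sim_get_copy(copy);

		if (c == NULL)
			return i;
		sim_stop_copy(c, thread_ran);
	}
	return -1;
}
```

With thread_ran set to 0, every iteration raises the count from 1 to 2 and lowers it back to 1, so the model never leaves the loop on its own. This matches the refcount value of 1 shown by crash above.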

Oct 31 09:49:29 nfs-server kernel: watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [nfsd:3134]
Oct 31 09:49:29 nfs-server kernel: Modules linked in: rpcsec_gss_krb5 xt_comment ip_tables xt_CHECKSUM xt_recent xt_MASQUERADE ipt_REJECT nf_reject_ipv4 xt_state xt_conntrack nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_counter target_core_user uio target_core_pscsi nft_compat target_core_file tun intel_rapl_msr nf_tables target_core_iblock nfnetlink bridge intel_rapl_common stp isst_if_mbox_msr llc isst_if_common iscsi_target_mod rfkill nfit target_core_mod kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl joydev mousedev psmouse pcspkr virtio_balloon i2c_piix4 bdev nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c sr_mod cirrus cdrom sd_mod drm_kms_helper t10_pi ata_generic syscopyarea sg sysfillrect sysimgblt ata_piix fb_sys_fops drm crc32c_intel serio_raw libata virtio_net net_failover virtio_scsi i2c_core failover dm_mirror dm_region_hash dm_log dm_mod mds
Oct 31 09:49:29 nfs-server kernel: CPU: 12 PID: 3134 Comm: nfsd Kdump: loaded Tainted: G             L    5.10.134-14.zncgsl6.x86_64 #1
Oct 31 09:49:29 nfs-server kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Oct 31 09:49:29 nfs-server kernel: RIP: 0010:nfs4_put_copy+0x19/0x40 [nfsd]
Oct 31 09:49:29 nfs-server kernel: Code: 6f 01 00 4c 8b 0c 24 e9 33 fd ff ff 0f 1f 44 00 00 0f 1f 44 00 00 48 8d 97 c0 05 00 00 b8 ff ff ff ff f0 0f c1 87 c0 05 00 00 <83> f8 01 74 09 85 c0 7e 0a c3 cc cc cc cc e9 c4 db c4 cb be 03 00
Oct 31 09:49:29 nfs-server kernel: RSP: 0018:ffffbc2840a43da0 EFLAGS: 00000213
Oct 31 09:49:29 nfs-server kernel: RAX: 0000000000000002 RBX: ffff9e1946737a60 RCX: 0000000000000000
Oct 31 09:49:29 nfs-server kernel: RDX: ffff9e19473045c0 RSI: ffffbc2840a43d40 RDI: ffff9e1947304000
Oct 31 09:49:29 nfs-server kernel: RBP: ffff9e1946737fd8 R08: 0000075e37e3c005 R09: 00000436db1ca967
Oct 31 09:49:29 nfs-server kernel: R10: 0000000000000000 R11: 00000436db1ca967 R12: ffff9e1946737fe8
Oct 31 09:49:29 nfs-server kernel: R13: 0000000000000001 R14: ffff9e1947304000 R15: ffff9e1946737ab8
Oct 31 09:49:29 nfs-server kernel: FS:  0000000000000000(0000) GS:ffff9e1a77c00000(0000) knlGS:0000000000000000
Oct 31 09:49:29 nfs-server kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 31 09:49:29 nfs-server kernel: CR2: 0000563bc36e1be8 CR3: 00000001bea10003 CR4: 0000000000770ee0
Oct 31 09:49:29 nfs-server kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Oct 31 09:49:29 nfs-server kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Oct 31 09:49:29 nfs-server kernel: PKRU: 55555554
Oct 31 09:49:29 nfs-server kernel: Call Trace:
Oct 31 09:49:29 nfs-server kernel: nfsd4_shutdown_copy+0x5f/0xa0 [nfsd]
Oct 31 09:49:29 nfs-server kernel: __destroy_client+0x1a6/0x200 [nfsd]
Oct 31 09:49:29 nfs-server kernel: nfs4_state_shutdown_net+0x156/0x240 [nfsd]
Oct 31 09:49:29 nfs-server kernel: ? nfsd_destroy+0x60/0x60 [nfsd]
Oct 31 09:49:29 nfs-server kernel: nfsd_shutdown_net+0x30/0x60 [nfsd]
Oct 31 09:49:29 nfs-server kernel: nfsd_last_thread+0xd5/0xf0 [nfsd]
Oct 31 09:49:29 nfs-server kernel: ? nfsd_destroy+0x60/0x60 [nfsd]
Oct 31 09:49:29 nfs-server kernel: nfsd_destroy+0x3c/0x60 [nfsd]
Oct 31 09:49:29 nfs-server kernel: nfsd+0x182/0x1b0 [nfsd]
Oct 31 09:49:29 nfs-server kernel: kthread+0x13e/0x160
Oct 31 09:49:29 nfs-server kernel: ? __kthread_cancel_work+0x50/0x50
Oct 31 09:49:29 nfs-server kernel: ret_from_fork+0x1f/0x30
Comment 2 小龙 admin 2024-11-04 15:16:09 UTC
The PR Link: https://gitee.com/anolis/cloud-kernel/pulls/4067
Comment 3 小龙 admin 2024-11-13 15:45:48 UTC
The PR Link: https://gitee.com/anolis/cloud-kernel/pulls/4098