[Defect description]:
kernel-selftests: the mptcp_connect.sh test case under the mptcp directory fails intermittently with the error "MPTCP copyfd_io_poll: poll timed out (events: POLLIN 1, POLLOUT 0)".
It reproduces fairly easily on a physical machine: out of 10 runs, roughly 1 to 3 fail. On a virtual machine, about 30 runs all passed.

Test log:
# ./mptcp_connect.sh
INFO: set ns3-6481858f-idnrtx dev ns3eth2: ethtool -K tso off gso off gro off
INFO: set ns4-6481858f-idnrtx dev ns4eth3: ethtool -K tso off gso off
Created /tmp/tmp.5N0g6MS91X (size 1898524 /tmp/tmp.5N0g6MS91X) containing data sent by client
Created /tmp/tmp.2tH752FxIc (size 4223004 /tmp/tmp.2tH752FxIc) containing data sent by server
New MPTCP socket can be blocked via sysctl [ OK ]
setsockopt(..., TCP_ULP, "mptcp", ...) blocked [ OK ]
INFO: validating network environment with pings
INFO: Using loss of 0.07% delay 22 ms reorder 93% 31% with delay 5ms on ns3eth4
ns1 MPTCP -> ns1 (10.0.1.1:10000 ) MPTCP (duration 23ms) [ OK ]
ns1 MPTCP -> ns1 (10.0.1.1:10001 ) TCP (duration 22ms) [ OK ]
ns1 TCP -> ns1 (10.0.1.1:10002 ) MPTCP (duration 21ms) [ OK ]
ns1 MPTCP -> ns1 (dead:beef:1::1:10003) MPTCP (duration 23ms) [ OK ]
ns1 MPTCP -> ns1 (dead:beef:1::1:10004) TCP (duration 21ms) [ OK ]
ns1 TCP -> ns1 (dead:beef:1::1:10005) MPTCP (duration 22ms) [ OK ]
ns1 MPTCP -> ns2 (10.0.1.2:10006 ) MPTCP (duration 26ms) [ OK ]
ns1 MPTCP -> ns2 (dead:beef:1::2:10007) MPTCP (duration 44ms) [ OK ]
ns1 MPTCP -> ns2 (10.0.2.1:10008 ) MPTCP (duration 28ms) [ OK ]
ns1 MPTCP -> ns2 (dead:beef:2::1:10009) MPTCP (duration 29ms) [ OK ]
ns1 MPTCP -> ns3 (10.0.2.2:10010 ) MPTCP (duration 300ms) [ OK ]
ns1 MPTCP -> ns3 (dead:beef:2::2:10011) MPTCP (duration 255ms) [ OK ]
ns1 MPTCP -> ns3 (10.0.3.2:10012 ) MPTCP (duration 275ms) [ OK ]
ns1 MPTCP -> ns3 (dead:beef:3::2:10013) MPTCP (duration 263ms) [ OK ]
ns1 MPTCP -> ns4 (10.0.3.1:10014 ) MPTCP (duration 263ms) [ OK ]
ns1 MPTCP -> ns4 (dead:beef:3::1:10015) MPTCP (duration 251ms) [ OK ]
ns2 MPTCP -> ns1 (10.0.1.1:10016 ) MPTCP (duration 46ms) [ OK ]
ns2 MPTCP -> ns1 (dead:beef:1::1:10017) MPTCP (duration 29ms) [ OK ]
ns2 MPTCP -> ns3 (10.0.2.2:10018 ) MPTCP (duration 257ms) [ OK ]
ns2 MPTCP -> ns3 (dead:beef:2::2:10019) MPTCP (duration 241ms) [ OK ]
ns2 MPTCP -> ns3 (10.0.3.2:10020 ) MPTCP (duration 245ms) [ OK ]
ns2 MPTCP -> ns3 (dead:beef:3::2:10021) MPTCP (duration 258ms) [ OK ]
ns2 MPTCP -> ns4 (10.0.3.1:10022 ) MPTCP (duration 268ms) [ OK ]
ns2 MPTCP -> ns4 (dead:beef:3::1:10023) MPTCP (duration 276ms) [ OK ]
ns3 MPTCP -> ns1 (10.0.1.1:10024 ) MPTCP (duration 454ms) [ OK ]
ns3 MPTCP -> ns1 (dead:beef:1::1:10025) MPTCP (duration 258ms) [ OK ]
ns3 MPTCP -> ns2 (10.0.1.2:10026 ) MPTCP (duration 280ms) [ OK ]
ns3 MPTCP -> ns2 (dead:beef:1::2:10027) MPTCP (duration 242ms) [ OK ]
ns3 MPTCP -> ns2 (10.0.2.1:10028 ) MPTCP (duration 519ms) [ OK ]
ns3 MPTCP -> ns2 (dead:beef:2::1:10029) MPTCP (duration 253ms) [ OK ]
ns3 MPTCP -> ns4 (10.0.3.1:10030 ) MPTCP (duration 65ms) [ OK ]
ns3 MPTCP -> ns4 (dead:beef:3::1:10031) MPTCP (duration 42ms) [ OK ]
ns4 MPTCP -> ns1 (10.0.1.1:10032 ) MPTCP (duration 474ms) [ OK ]
ns4 MPTCP -> ns1 (dead:beef:1::1:10033) MPTCP (duration 360ms) [ OK ]
ns4 MPTCP -> ns2 (10.0.1.2:10034 ) MPTCP (duration 768ms) [ OK ]
ns4 MPTCP -> ns2 (dead:beef:1::2:10035) MPTCP (duration 440ms) [ OK ]
ns4 MPTCP -> ns2 (10.0.2.1:10036 ) MPTCP copyfd_io_poll: poll timed out (events: POLLIN 1, POLLOUT 0)
(duration 30311ms) [ FAIL ] client exit code 2, server 0

netns ns2-6481858f-idnrtx socket stat for 10036:
State      Recv-Q  Send-Q  Local Address:Port  Peer Address:Port  Process
TIME-WAIT  0       0       10.0.2.1:10036      10.0.3.1:34106     timer:(timewait,59sec,0)

netns ns4-6481858f-idnrtx socket stat for 10036:
State      Recv-Q  Send-Q  Local Address:Port  Peer Address:Port  Process
LAST-ACK   0       1       10.0.3.1:34106      10.0.2.1:10036     timer:(on,205ms,0)
    ts sack cubic wscale:7,7 rto:223 rtt:22.232/0.048 ato:40 mss:1448 pmtu:1500 rcvmss:1420 advmss:1448 cwnd:1385 bytes_sent:1898524 bytes_acked:1898525 bytes_received:4223005 segs_out:1812 segs_in:3747 data_segs_out:1392 data_segs_in:3116 send 721655272bps lastsnd:30122 lastrcv:30049 lastack:30049 pacing_rate 1443294312bps delivery_rate 135796920bps delivered:1393 busy:218ms unacked:1 reordering:4 reord_seen:3 rcv_rtt:22.88 rcv_space:14480 rcv_ssthresh:3143668 minrtt:22.017
    tcp-ulp-mptcp flags:Mmec token:0000(id:0)/822a9d0c(id:0) seq:a080289ed129332 sfseq:406001 ssnoff:bdf081e2 maplen:101c

ns4 MPTCP -> ns2 (dead:beef:2::1:10037) MPTCP (duration 992ms) [ OK ]
ns4 MPTCP -> ns3 (10.0.2.2:10038 ) MPTCP (duration 45ms) [ OK ]
ns4 MPTCP -> ns3 (dead:beef:2::2:10039) MPTCP (duration 32ms) [ OK ]
ns4 MPTCP -> ns3 (10.0.3.2:10040 ) MPTCP (duration 46ms) [ OK ]
ns4 MPTCP -> ns3 (dead:beef:3::2:10041) MPTCP (duration 56ms) [ OK ]
Time: 46 seconds

[Environment]:
Reproduction environment: anck 5.10, x86 physical machine
Reproduction rate: reproducible on the physical machine (intermittent, roughly 1 to 3 failures per 10 runs)

Kernel:
# uname -r
5.10.134-9.git.05dbf5c52.an8.x86_64

Operating system:
# cat /etc/os-release
NAME="Anolis OS"
VERSION="8.8"
ID="anolis"
ID_LIKE="rhel fedora centos"
VERSION_ID="8.8"
PLATFORM_ID="platform:an8"
PRETTY_NAME="Anolis OS 8.8"
ANSI_COLOR="0;31"
HOME_URL="https://openanolis.cn/"

CPU:
# lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              24
On-line CPU(s) list: 0-23
Thread(s) per core:  2
Core(s) per socket:  12
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
BIOS Vendor ID:      Intel
CPU family:          6
Model:               63
Model name:          Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
BIOS Model name:     Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
Stepping:            2
CPU MHz:             2292.150
CPU max MHz:         2500.0000
CPU min MHz:         1200.0000
BogoMIPS:            4988.74
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            30720K
NUMA node0 CPU(s):   0-23
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm cpuid_fault epb invpcid_single pti intel_ppin ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm xsaveopt cqm_llc cqm_occup_llc dtherm arat pln pts md_clear flush_l1d

Memory:
# free -h
        total   used    free    shared  buff/cache  available
Mem:    62Gi    2.3Gi   57Gi    186Mi   3.4Gi       59Gi
Swap:   2.0Gi   88Mi    1.9Gi

[Steps to reproduce]:
Download the kernel source package that matches the running kernel, then:
rpm -ivh xxx.src.rpm    # installs under /root by default
yum-builddep -y rpmbuild/SPECS/kernel.spec    # installs the build dependencies automatically; requires yum-utils
rpmbuild -bp ./rpmbuild/SPECS/kernel.spec    # applies the patches, unpacks the tarball, and creates the BUILD directory
cd rpmbuild/BUILD/kernel-xxx/linux-xxx/
cd tools/testing/selftests/net/mptcp
make
Run the test case:
./mptcp_connect.sh

[Expected result]:
The test case passes.

[Actual result]:
The test case fails.

[Root cause analysis]:
A similar issue has been reported upstream: https://github.com/multipath-tcp/mptcp_net-next/issues/230
After applying the following patch locally, all 30 runs on the same physical machine passed: https://patchwork.kernel.org/project/mptcp/patch/20221219075048.255811-6-imagedong@tencent.com/
The problem is clear and a similar patch has already been found. Could the developers please confirm?
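Because the failure is intermittent, the before/after comparison relies on repeating the test many times. A minimal sketch of such a loop follows. The helper name count_failures, the LOG_DIR variable and the run_N.log file names are hypothetical and not part of the selftests, and the sketch assumes mptcp_connect.sh exits non-zero when any sub-test fails.

```shell
# Hypothetical helper (not part of kernel-selftests): run a command N times
# and print how many of those runs exited with a non-zero status.
# Usage: count_failures N command [args...]
count_failures() {
    local runs="$1" fails=0 i=1
    shift
    # Assumed convention: per-run logs go to $LOG_DIR, or the current directory.
    local log_dir="${LOG_DIR:-.}"
    while [ "$i" -le "$runs" ]; do
        # Keep each run's output so that a failing run can be inspected later.
        if ! "$@" > "$log_dir/run_$i.log" 2>&1; then
            fails=$((fails + 1))
        fi
        i=$((i + 1))
    done
    echo "$fails"
}

# Example, from tools/testing/selftests/net/mptcp:
#   count_failures 30 ./mptcp_connect.sh
```

Keeping one log per run means the socket statistics printed for a failing connection are still available after the loop finishes.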
The PR Link: https://gitee.com/anolis/cloud-kernel/pulls/1781
Fixed. PR: https://gitee.com/anolis/cloud-kernel/pulls/1781
All 30 runs on a physical machine running the rc3 kernel passed. The problem is resolved and the bug is closed.

# uname -r
5.10.134-15_rc3.an8.x86_64

###########Begin 30 test###############
INFO: set ns3-64ab824a-xuw1HF dev ns3eth2: ethtool -K tso off gso off gro off
INFO: set ns4-64ab824a-xuw1HF dev ns4eth3: ethtool -K gso off gro off
Created /tmp/tmp.6Ft6UUKr5s (size 5508124 /tmp/tmp.6Ft6UUKr5s) containing data sent by client
Created /tmp/tmp.ygsMekJDZv (size 8003612 /tmp/tmp.ygsMekJDZv) containing data sent by server
New MPTCP socket can be blocked via sysctl [ OK ]
setsockopt(..., TCP_ULP, "mptcp", ...) blocked [ OK ]
INFO: validating network environment with pings
INFO: Using loss of 0.32% delay 13 ms reorder 95% 58% with delay 3ms on ns3eth4
ns1 MPTCP -> ns1 (10.0.1.1:10000 ) MPTCP (duration 45ms) [ OK ]
ns1 MPTCP -> ns1 (10.0.1.1:10001 ) TCP (duration 48ms) [ OK ]
ns1 TCP -> ns1 (10.0.1.1:10002 ) MPTCP (duration 41ms) [ OK ]
ns1 MPTCP -> ns1 (dead:beef:1::1:10003) MPTCP (duration 40ms) [ OK ]
ns1 MPTCP -> ns1 (dead:beef:1::1:10004) TCP (duration 53ms) [ OK ]
ns1 TCP -> ns1 (dead:beef:1::1:10005) MPTCP (duration 63ms) [ OK ]
ns1 MPTCP -> ns2 (10.0.1.2:10006 ) MPTCP (duration 38ms) [ OK ]
ns1 MPTCP -> ns2 (dead:beef:1::2:10007) MPTCP (duration 57ms) [ OK ]
ns1 MPTCP -> ns2 (10.0.2.1:10008 ) MPTCP (duration 67ms) [ OK ]
ns1 MPTCP -> ns2 (dead:beef:2::1:10009) MPTCP (duration 55ms) [ OK ]
ns1 MPTCP -> ns3 (10.0.2.2:10010 ) MPTCP (duration 240ms) [ OK ]
ns1 MPTCP -> ns3 (dead:beef:2::2:10011) MPTCP (duration 372ms) [ OK ]
ns1 MPTCP -> ns3 (10.0.3.2:10012 ) MPTCP (duration 482ms) [ OK ]
ns1 MPTCP -> ns3 (dead:beef:3::2:10013) MPTCP (duration 226ms) [ OK ]
ns1 MPTCP -> ns4 (10.0.3.1:10014 ) MPTCP (duration 309ms) [ OK ]
ns1 MPTCP -> ns4 (dead:beef:3::1:10015) MPTCP (duration 1945ms) [ OK ]
ns2 MPTCP -> ns1 (10.0.1.1:10016 ) MPTCP (duration 43ms) [ OK ]
ns2 MPTCP -> ns1 (dead:beef:1::1:10017) MPTCP (duration 43ms) [ OK ]
ns2 MPTCP -> ns3 (10.0.2.2:10018 ) MPTCP (duration 1398ms) [ OK ]
ns2 MPTCP -> ns3 (dead:beef:2::2:10019) MPTCP (duration 1714ms) [ OK ]
ns2 MPTCP -> ns3 (10.0.3.2:10020 ) MPTCP (duration 234ms) [ OK ]
ns2 MPTCP -> ns3 (dead:beef:3::2:10021) MPTCP (duration 287ms) [ OK ]
ns2 MPTCP -> ns4 (10.0.3.1:10022 ) MPTCP (duration 209ms) [ OK ]
ns2 MPTCP -> ns4 (dead:beef:3::1:10023) MPTCP (duration 224ms) [ OK ]
ns3 MPTCP -> ns1 (10.0.1.1:10024 ) MPTCP (duration 269ms) [ OK ]
ns3 MPTCP -> ns1 (dead:beef:1::1:10025) MPTCP (duration 1009ms) [ OK ]
ns3 MPTCP -> ns2 (10.0.1.2:10026 ) MPTCP (duration 219ms) [ OK ]
ns3 MPTCP -> ns2 (dead:beef:1::2:10027) MPTCP (duration 446ms) [ OK ]
ns3 MPTCP -> ns2 (10.0.2.1:10028 ) MPTCP (duration 751ms) [ OK ]
ns3 MPTCP -> ns2 (dead:beef:2::1:10029) MPTCP (duration 2206ms) [ OK ]
ns3 MPTCP -> ns4 (10.0.3.1:10030 ) MPTCP (duration 44ms) [ OK ]
ns3 MPTCP -> ns4 (dead:beef:3::1:10031) MPTCP (duration 44ms) [ OK ]
ns4 MPTCP -> ns1 (10.0.1.1:10032 ) MPTCP (duration 189ms) [ OK ]
ns4 MPTCP -> ns1 (dead:beef:1::1:10033) MPTCP (duration 367ms) [ OK ]
ns4 MPTCP -> ns2 (10.0.1.2:10034 ) MPTCP (duration 2534ms) [ OK ]
ns4 MPTCP -> ns2 (dead:beef:1::2:10035) MPTCP (duration 2108ms) [ OK ]
ns4 MPTCP -> ns2 (10.0.2.1:10036 ) MPTCP (duration 1959ms) [ OK ]
ns4 MPTCP -> ns2 (dead:beef:2::1:10037) MPTCP (duration 1990ms) [ OK ]
ns4 MPTCP -> ns3 (10.0.2.2:10038 ) MPTCP (duration 46ms) [ OK ]
ns4 MPTCP -> ns3 (dead:beef:2::2:10039) MPTCP (duration 43ms) [ OK ]
ns4 MPTCP -> ns3 (10.0.3.2:10040 ) MPTCP (duration 43ms) [ OK ]
ns4 MPTCP -> ns3 (dead:beef:3::2:10041) MPTCP (duration 44ms) [ OK ]
Time: 28 seconds
###########End 30 test#################
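When a verification log like the one above is saved to a file, the pass and fail markers can be tallied rather than read by eye. This is a rough sketch only: summarize_log is a hypothetical helper name, and it assumes results are marked with the literal strings "[ OK ]" and "[ FAIL ]" as they appear in this report.

```shell
# Hypothetical helper: count the "[ OK ]" and "[ FAIL ]" markers in a saved
# mptcp_connect.sh log. Matches are counted individually, so the result does
# not depend on how the log lines are wrapped.
summarize_log() {
    local log="$1" ok fail
    ok=$(grep -o '\[ OK \]' "$log" | wc -l) || true
    fail=$(grep -o '\[ FAIL \]' "$log" | wc -l) || true
    # Arithmetic expansion strips any padding that wc adds on some systems.
    echo "ok=$((ok)) fail=$((fail))"
}

# Example:
#   ./mptcp_connect.sh > run.log 2>&1; summarize_log run.log
```

The OK count also includes the two sysctl and setsockopt checks printed before the connection tests.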
See the comment above: verification passed, closing the bug.