Description of problem:
The net.psock_snd.sh case in the kernel-selftest suite fails. Log:

# ./psock_snd.sh
dgram
  tx: 128 rx: 142 rx: 100 OK
dgram bind
  tx: 128 rx: 142 rx: 100 OK
raw
  tx: 142 rx: 142 rx: 100 OK
raw bind
  tx: 142 rx: 142 rx: 100 OK
raw qdisc bypass
  tx: 142 rx: 142 rx: 100 OK
raw vlan
  tx: 146 rx: 100 OK
raw vnet hdr
  tx: 152 rx: 142 rx: 100 OK
raw csum_off
  tx: 152 rx: 142 rx: 100 OK
raw csum_off with bad offset (expected to fail)
  ./psock_snd: write: Invalid argument
raw min size
  tx: 42 rx: 0 OK
raw mtu size
  tx: 1514 rx: 1472 OK
raw mtu size + 1 (expected to fail)
  ./psock_snd: write: Message too long
raw vlan mtu size + 1 (expected to fail)
  ./psock_snd: write: Message too long
dgram mtu size
  tx: 1500 rx: 1472 OK
dgram mtu size + 1 (expected to fail)
  ./psock_snd: write: Message too long
raw truncate hlen (expected to fail: does not arrive)
  tx: 14
  ./psock_snd: recv: Resource temporarily unavailable
raw truncate hlen - 1 (expected to fail: EINVAL)
  ./psock_snd: write: Invalid argument
raw gso min size
  tx: 1525 rx: 1473 OK
raw gso min size - 1 (expected to fail)
  tx: 1524 rx: 1472 OK

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
Download the kernel source package that matches the running kernel.
rpm -ivh xxx.src.rpm    # installs under /root by default
yum-builddep -y rpmbuild/SPECS/kernel.spec    # installs the build dependencies automatically; requires yum-utils
rpmbuild -bp ./rpmbuild/SPECS/kernel.spec    # applies the patches, unpacks the tarball, and creates the BUILD directory
cd rpmbuild/BUILD/kernel-xxx/linux-xxx/    # the tree is now ready to build and test
cd /tools/testing/selftests/bpf/
make
Run the test case:
cd kernel-selftests/net
./psock_snd.sh

Actual results:
The test case fails.

Expected results:
The test case passes.

Additional info:
OS version:
# cat /etc/os-release
NAME="Anolis OS"
VERSION="8.6"
ID="anolis"
ID_LIKE="rhel fedora centos"
VERSION_ID="8.6"
PLATFORM_ID="platform:an8"
PRETTY_NAME="Anolis OS 8.6"
ANSI_COLOR="0;31"
HOME_URL="https://openanolis.cn/"

Kernel:
# uname -r
5.10.134-259.git.a5a0ae7bf175.an8.aarch64

Memory:
# free -h
              total        used        free      shared  buff/cache   available
Mem:           15Gi       262Mi        14Gi        11Mi       754Mi        14Gi
Swap:            0B          0B          0B

CPU:
# lscpu
Architecture:         aarch64
Byte Order:           Little Endian
CPU(s):               4
On-line CPU(s) list:  0-3
Thread(s) per core:   1
Core(s) per cluster:  4
Socket(s):            1
Cluster(s):           1
NUMA node(s):         1
Vendor ID:            ARM
BIOS Vendor ID:       Alibaba Cloud
Model:                1
Model name:           Neoverse-N1
BIOS Model name:      virt-rhel7.6.0
Stepping:             r3p1
BogoMIPS:             50.00
NUMA node0 CPU(s):    0-3
Flags:                fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
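The log in this report does not say which case is the failing one, so here is a small sketch that picks it out. It assumes a case "succeeded" when its output ends with the OK marker, and that a case whose name contains "(expected to fail" must not succeed. The helper name case_passed and the text-based heuristic are my own illustration; the real psock_snd.sh presumably relies on the exit status of psock_snd instead. The sample entries are copied from the log in this report.

```python
def case_passed(name: str, output: str) -> bool:
    """Return True when the outcome matches the expectation in the case name.

    Heuristic (an assumption of this sketch): output ending in "OK" means the
    send/receive succeeded; "(expected to fail" in the name means it must not.
    """
    succeeded = output.rstrip().endswith("OK")
    expected_to_fail = "(expected to fail" in name
    return succeeded != expected_to_fail


# A few (case name, output) pairs taken from the log in this report.
cases = [
    ("raw gso min size", "tx: 1525 rx: 1473 OK"),
    ("raw mtu size + 1 (expected to fail)",
     "./psock_snd: write: Message too long"),
    ("raw truncate hlen (expected to fail: does not arrive)",
     "tx: 14 ./psock_snd: recv: Resource temporarily unavailable"),
    ("raw gso min size - 1 (expected to fail)", "tx: 1524 rx: 1472 OK"),
]

failed = [name for name, output in cases if not case_passed(name, output)]
print(failed)
```

With these samples only the last case, "raw gso min size - 1 (expected to fail)", is flagged: it is expected to fail but the packet is sent and received.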
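This is a known issue. Conclusion:

On the 4.19 kernel there is no check of whether the packet size qualifies for GSO. skb_shinfo(skb)->gso_size = gso_size is set every time and GSO is attempted. Because this packet does not meet the conditions for GSO, it is actually dropped, so the receive side reports an error and the test passes.

On 5.10 and the latest kernels a check was added here: if the size of the current skb does not meet the conditions for GSO, skb_shinfo(skb)->gso_size is not set, so GSO is not attempted for this packet and both the send and the receive succeed. This check is what makes the psock_snd case ("./in_netns.sh ./psock_snd -v -c -g -l 1472") fail.

The extra check tolerates more kinds of misuse by upper layers: the kernel now handles such misuse correctly and no longer returns an error to the caller. Nothing is actually wrong, so the failure of this case should have no practical impact.

A patch that removes the case has been submitted upstream. It will be backported once it is merged into mainline.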
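To make the difference between the two kernels concrete, here is a rough model in Python of the decision described above. It is a sketch under assumptions, not the kernel code: the function names are made up, and the requested gso_size of 1480 (MTU 1500 minus a 20-byte IPv4 header) and the exact comparison (the bytes after the IPv4 header must exceed gso_size) are my reading of the test and of the check, not something stated in this report. The header sizes (14-byte Ethernet, 20-byte IPv4, 8-byte UDP, 10-byte virtio_net_hdr) are the standard ones and match the tx values in the log.

```python
ETH_HLEN = 14        # Ethernet header
IPV4_HLEN = 20       # IPv4 header without options
UDP_HLEN = 8         # UDP header
VNET_HDR_LEN = 10    # struct virtio_net_hdr prepended by the sender


def frame_len(payload_len: int) -> int:
    """Length of the Ethernet frame built for a given UDP payload (-l)."""
    return ETH_HLEN + IPV4_HLEN + UDP_HLEN + payload_len


def applied_gso_size(frame: int, gso_size: int, has_size_check: bool) -> int:
    """gso_size recorded on the skb in this model; 0 means GSO is not attempted.

    has_size_check=False models 4.19 as described above (always set).
    has_size_check=True models 5.10 (set only when the data exceeds gso_size).
    The comparison used here is an assumption of this sketch.
    """
    if not has_size_check:
        return gso_size
    segmentable = frame - (ETH_HLEN + IPV4_HLEN)  # UDP header + payload
    return gso_size if segmentable > gso_size else 0


GSO_SIZE = 1500 - IPV4_HLEN  # assumed value requested by the test: 1480

for payload in (1473, 1472):
    frame = frame_len(payload)
    print(payload,
          frame + VNET_HDR_LEN,                      # compare with "tx:" in the log
          applied_gso_size(frame, GSO_SIZE, False),  # 4.19 model
          applied_gso_size(frame, GSO_SIZE, True))   # 5.10 model
```

In this model the -l 1472 packet gets gso_size 1480 on 4.19 (GSO is attempted and, per the explanation above, the packet is dropped), while on 5.10 gso_size stays unset and the packet goes out as an ordinary frame, which is why the "expected to fail" case reports OK.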
The same problem occurs for this case in the community nightly kernel-selftests run on an Anolis 23 x86 ECS environment.
--------------
[root@qibo-anolis23-nightly-func-x86-1 net]# ./psock_snd.sh
dgram
  tx: 128 rx: 142 rx: 100 OK
dgram bind
  tx: 128 rx: 142 rx: 100 OK
raw
  tx: 142 rx: 142 rx: 100 OK
raw bind
  tx: 142 rx: 142 rx: 100 OK
raw qdisc bypass
  tx: 142 rx: 142 rx: 100 OK
raw vlan
  tx: 146 rx: 100 OK
raw vnet hdr
  tx: 152 rx: 142 rx: 100 OK
raw csum_off
  tx: 152 rx: 142 rx: 100 OK
raw csum_off with bad offset (expected to fail)
  ./psock_snd: write: Invalid argument
raw min size
  tx: 42 rx: 0 OK
raw mtu size
  tx: 1514 rx: 1472 OK
raw mtu size + 1 (expected to fail)
  ./psock_snd: write: Message too long
raw vlan mtu size + 1 (expected to fail)
  ./psock_snd: write: Message too long
dgram mtu size
  tx: 1500 rx: 1472 OK
dgram mtu size + 1 (expected to fail)
  ./psock_snd: write: Message too long
raw truncate hlen (expected to fail: does not arrive)
  tx: 14
  ./psock_snd: recv: Resource temporarily unavailable
raw truncate hlen - 1 (expected to fail: EINVAL)
  ./psock_snd: write: Invalid argument
raw gso min size
  tx: 1525 rx: 1473 OK
raw gso min size - 1 (expected to fail)
  tx: 1524 rx: 1472 OK
[root@qibo-anolis23-nightly-func-x86-1 net]# uname -r
5.10.134-1.git.2ed1510fd4be.an23.x86_64
[root@qibo-anolis23-nightly-func-x86-1 net]# cat /etc/anolis-release
Anolis OS release 23
The Anolis 23 arm environment has the same problem:
[root@qibo-anolis23-nightly-func-arm-1 ~]# uname -r
5.10.134-125.git.21e574ab8b57.an23.aarch64
[root@qibo-anolis23-nightly-func-arm-1 ~]# cat /etc/os-release
NAME="Anolis OS"
VERSION="23"
ID="anolis"
VERSION_ID="23"
PLATFORM_ID="platform:an23"
PRETTY_NAME="Anolis OS 23"
ANSI_COLOR="0;31"
HOME_URL="https://openanolis.cn/"
BUG_REPORT_URL="https://bugzilla.openanolis.cn/"