Bug 2644 - net: Fix gro aggregation for udp encaps with zero csum
Summary: net: Fix gro aggregation for udp encaps with zero csum
Status: RESOLVED FIXED
Alias: None
Product: ANCK 4.19 Dev
Classification: ANCK
Component: net
Version: unspecified
Hardware: All Linux
Importance: P3-Medium S3-normal
Target Milestone: ---
Assignee: XuanZhuo
QA Contact: shuming
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-10-27 19:05 UTC by LeoLiu-oc
Modified: 2023-01-17 15:12 UTC

See Also:


Description LeoLiu-oc zhaoxin_group 2022-10-27 19:05:02 UTC
Description of problem:
This patch significantly improves performance for UDP encapsulation with a zero checksum. When the checksum in the UDP header is zero, GRO aggregation does not occur on the physical device (where it should be handled) but is deferred until inner packet processing. For performance reasons, GRO should be performed as early as possible.
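Since the aggregation in question happens on the physical device, it can help to confirm that GRO is enabled there before measuring. The following is a minimal sketch, not part of the patch: gro_state is a hypothetical helper name, and it assumes the usual `ethtool -k` output in which the feature appears as a line of the form `generic-receive-offload: on`.

```sh
#!/bin/sh
# gro_state: read `ethtool -k <dev>` output on stdin and print the state
# ("on" or "off") of generic-receive-offload. Prints nothing if the
# feature line is not present. (Hypothetical helper, for illustration.)
gro_state() {
    awk -F': *' '$1 ~ /^[ \t]*generic-receive-offload$/ {
        split($2, a, " ")   # drop a trailing "[fixed]" marker, if any
        print a[1]
        exit
    }'
}

# Usage on a real host (eth1 is the example device name from this report):
#   ethtool -k eth1 | gro_state
```

If this prints "off", GRO can usually be turned on with `ethtool -K eth1 gro on` before running the test.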

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
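Create a VXLAN tunnel and run iperf3 over it; comparing the results shows the network performance improvement.
Run the iperf3 TCP performance test on a kernel without this patch and on a kernel with it. Test environment:
Motherboard	CHX002
OS	CentOS 7.7.1908
Kernel	3.10.0-1062.18.1 (zx-patch-v3.0.9.5)
NIC	Intel x520(82599)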

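Connect the two machines directly or through a 10 Gigabit switch, configure the IP addresses correctly, and verify that the hosts can reach each other. The following IPs are used as an example:
host A:	192.168.1.1(eth1)
host B: 	192.168.1.2(eth1)
Note: eth1 here stands for the network device. Confirm the actual name on your system, because the configuration commands below use it.
1. Bind the NIC interrupt:
# ethtool -L eth1 combined 1
# cat /proc/interrupts | grep eth1
(The first number shown is the interrupt number of the NIC, for example: 92)
# echo 1 > /proc/irq/92/smp_affinity
Run the steps above on both host A and host B.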
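
The interrupt lookup in step 1 can be scripted instead of read off by eye. The sketch below is illustrative only: first_irq and cpu_mask are hypothetical helper names, and it assumes the standard /proc/interrupts layout where each line starts with the IRQ number followed by a colon. The value written to smp_affinity is a hexadecimal CPU bitmask, so the 1 used above selects CPU 0. cpu_mask assumes a CPU index below 32.

```sh
#!/bin/sh
# first_irq DEV: read /proc/interrupts-style text on stdin and print the
# IRQ number of the first line that mentions DEV.
first_irq() {
    awk -v dev="$1" 'index($0, dev) {
        sub(/:$/, "", $1)   # "92:" -> "92"
        print $1
        exit
    }'
}

# cpu_mask N: print the hexadecimal smp_affinity bitmask that selects only
# CPU N (bit N set). Assumes N is below 32.
cpu_mask() {
    printf '%x\n' $((1 << $1))
}

# Usage on a real host, as root (eth1 is the example device name):
#   irq=$(first_irq eth1 < /proc/interrupts)
#   cpu_mask 0 > "/proc/irq/$irq/smp_affinity"
```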

2. Create the VXLAN:
	host A:
# ip link add vxlan01 type vxlan id 2 remote 192.168.1.2 local 192.168.1.1 dstport 5766 dev eth1
# ip addr add 10.0.2.1/24  dev vxlan01
# ip link set vxlan01 up
	host B:
# ip link add vxlan01 type vxlan id 2 remote 192.168.1.1 local 192.168.1.2 dstport 5766 dev eth1
# ip addr add 10.0.2.2/24  dev vxlan01
# ip link set vxlan01 up
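
The two hosts run the same three commands with the local and remote addresses swapped. A minimal sketch of that symmetry follows; vxlan_cmds is a hypothetical helper that only prints the commands, so they can be reviewed before being run as root. The VXLAN id, port, and interface names are the ones used in this report.

```sh
#!/bin/sh
# vxlan_cmds LOCAL_IP REMOTE_IP INNER_CIDR [DEV]
# Print (do not run) the commands that create the vxlan01 tunnel endpoint
# on one host. DEV defaults to eth1, the example device in this report.
vxlan_cmds() {
    local_ip=$1
    remote_ip=$2
    inner_cidr=$3
    dev=${4:-eth1}
    printf 'ip link add vxlan01 type vxlan id 2 remote %s local %s dstport 5766 dev %s\n' \
        "$remote_ip" "$local_ip" "$dev"
    printf 'ip addr add %s dev vxlan01\n' "$inner_cidr"
    printf 'ip link set vxlan01 up\n'
}

# host A: vxlan_cmds 192.168.1.1 192.168.1.2 10.0.2.1/24
# host B: vxlan_cmds 192.168.1.2 192.168.1.1 10.0.2.2/24
```

Piping the output to `sh` as root applies it. Once both ends are up, `ping 10.0.2.1` from host B is a quick check that the tunnel passes traffic.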

3. Test:
host A:  iperf3 -A 1  -s
host B:  iperf3 -A 1  -c 10.0.2.1  -t 60

Comparing the test data before and after applying the patch shows a clear improvement, from 3.x Gbits/sec to 7.x Gbits/sec.
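
To compare runs without reading the full iperf3 log, the summary bitrate can be pulled out of the text output. This is a sketch under assumptions: iperf3_rate is a hypothetical helper, and it relies on the default single-stream iperf3 text output, where the final summary lines end in "sender" or "receiver" and carry a "<number> Gbits/sec" style field.

```sh
#!/bin/sh
# iperf3_rate ROLE: read iperf3 text output on stdin and print the bitrate
# and its unit from the first summary line whose last field is ROLE
# ("sender" or "receiver").
iperf3_rate() {
    awk -v role="$1" '$NF == role {
        for (i = 2; i <= NF; i++) {
            if ($i ~ /bits\/sec$/) {
                print $(i - 1), $i
                exit
            }
        }
    }'
}

# Usage on host B, once per kernel under test:
#   iperf3 -A 1 -c 10.0.2.1 -t 60 | iperf3_rate receiver
```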


Actual results:


Expected results:


Additional info:
Comment 1 dust.li alibaba_cloud_group 2022-10-28 09:46:03 UTC
(In reply to LeoLiu-oc from comment #0)

Thank you very much!

The test above appears to have been run on the CentOS 3.10 kernel. Has it also been tested and verified on anck-4.19?
Comment 2 maqiao alibaba_cloud_group 2023-01-17 15:12:12 UTC
merged: https://gitee.com/anolis/cloud-kernel/pulls/806