Bug 5811 - mlx5_core Mellanox ConnectX-5 ping has a large latency
Summary: mlx5_core Mellanox ConnectX-5 ping has a large latency
Status: NEW
Alias: None
Product: Anolis OS 8
Classification: Anolis OS
Component: kernel - anck-4.19
Version: 8.2
Hardware: x86_64 Linux
Importance: P3-Medium S3-normal
Target Milestone: ---
Assignee: wuhao
QA Contact: shuming
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-07-10 10:59 UTC by djhwyh
Modified: 2023-09-21 15:25 UTC

See Also:


Attachments
ping latency (61.80 KB, image/png)
2023-07-10 10:59 UTC, djhwyh

Description djhwyh 2023-07-10 10:59:23 UTC
Created attachment 836
ping latency

Description of problem:
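Many servers across different IDC data centers show this problem. Specifically, when two Anolis OS servers under the same access switch ping each other, after some time the ping latency can reach several hundred or even more than a thousand milliseconds. Normally it is below 1 millisecond.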
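
The symptom described above can be checked mechanically against saved `ping` output. The following is a minimal sketch, not part of the original report: it assumes the usual Linux iputils reply format (`icmp_seq=N ... time=X ms`), and the function name `find_latency_spikes` and the 100 ms default threshold are illustrative choices only.

```python
import re

# Matches the sequence number and RTT of an iputils-style reply line,
# e.g. "64 bytes from host: icmp_seq=1 ttl=64 time=0.123 ms".
_RTT_RE = re.compile(r"icmp_seq=(\d+).*?time=([\d.]+)\s*ms")


def find_latency_spikes(ping_output, threshold_ms=100.0):
    """Return (icmp_seq, rtt_ms) pairs whose RTT exceeds threshold_ms."""
    spikes = []
    for line in ping_output.splitlines():
        match = _RTT_RE.search(line)
        if match is None:
            continue  # header, summary, or timeout line
        seq, rtt = int(match.group(1)), float(match.group(2))
        if rtt > threshold_ms:
            spikes.append((seq, rtt))
    return spikes


if __name__ == "__main__":
    # Made-up sample data for illustration, not taken from the attachment.
    sample = """\
64 bytes from 192.0.2.10: icmp_seq=1 ttl=64 time=0.212 ms
64 bytes from 192.0.2.10: icmp_seq=2 ttl=64 time=843 ms
64 bytes from 192.0.2.10: icmp_seq=3 ttl=64 time=0.198 ms
"""
    print(find_latency_spikes(sample))
```

Running this over a long capture gives the sequence numbers of the slow replies, which can then be lined up with other logs from the same time window.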

Version-Release number of selected component (if applicable):
Operating system: Anolis OS 8.2 QU1, kernel: 4.19.91-24.8.an8.x86_64
Example NIC driver and firmware information:

ethtool -i eth0
driver: mlx5_core
version: 5.0-0
firmware-version: 16.32.1010 (MT_0000000248)
expansion-rom-version:
bus-info: 0000:1a:00.0
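
When many servers are affected, it can help to collect the driver and firmware fields in a structured form. Below is a minimal sketch that parses `ethtool -i` style `key: value` lines into a dictionary. The function name `parse_ethtool_info` is an illustrative assumption. The sample values are the ones quoted in this report.

```python
def parse_ethtool_info(text):
    """Parse `ethtool -i` style "key: value" lines into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as the PCI
        # address "0000:1a:00.0" stay intact.
        key, sep, value = line.partition(":")
        if not sep or not key.strip():
            continue  # blank line or a line without a "key:" prefix
        info[key.strip()] = value.strip()
    return info


if __name__ == "__main__":
    sample = """\
driver: mlx5_core
version: 5.0-0
firmware-version: 16.32.1010 (MT_0000000248)
expansion-rom-version:
bus-info: 0000:1a:00.0
"""
    info = parse_ethtool_info(sample)
    print(info["driver"], info["version"], info["firmware-version"])
```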

How reproducible:
Have two Anolis OS servers under the same access switch ping each other. The servers run container workloads.


Actual results:
The ping latency results are shown in the attachment (attachment 836, ping latency).

Expected results:


Additional info:
Comment 1 maqiao alibaba_cloud_group 2023-07-10 11:27:29 UTC
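Using pingtrace to localize the problem, we found a noticeable delay between the client handing the packet to its NIC driver and the server's NIC driver receiving it. We suspect a problem in the driver or the switch.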
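
The localization idea in this comment, finding which hop of the path contributes the delay, can be sketched as follows. This is an illustration only and does not reproduce pingtrace's actual output format: the stage names, the timestamps, and the function name `largest_gap` are all hypothetical, and the sketch assumes every timestamp is expressed in microseconds on a common clock.

```python
def largest_gap(stamps):
    """Given ordered (stage_name, timestamp_us) pairs, return the
    (from_stage, to_stage, delta_us) segment with the largest delay."""
    if len(stamps) < 2:
        raise ValueError("need at least two timestamped stages")
    # Pair each stage with the next one and compute the time spent between them.
    segments = [
        (a_name, b_name, b_ts - a_ts)
        for (a_name, a_ts), (b_name, b_ts) in zip(stamps, stamps[1:])
    ]
    return max(segments, key=lambda seg: seg[2])


if __name__ == "__main__":
    # Hypothetical stage names and timestamps, for illustration only.
    trace = [
        ("client_send", 0),
        ("client_nic_tx", 20),
        ("server_nic_rx", 650020),
        ("server_reply", 650050),
    ]
    print(largest_gap(trace))
```

With data shaped like this, the largest segment falls between the client NIC driver and the server NIC driver, which is the pattern the comment describes.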
Comment 2 wuhao alibaba_cloud_group 2023-09-21 15:25:15 UTC
We switched to the 5.8 driver from the official Mellanox website and adapted it to the 4.19 kernel. The problem is fixed.
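
To find hosts that still run the older driver, the `version` field reported by `ethtool -i` can be compared against 5.8. The following is a minimal sketch under the assumption that the version string begins with `major.minor` (as in the `5.0-0` value quoted in this report). The function name `driver_at_least` is illustrative, and the exact version string the replacement driver reports is not stated in this bug.

```python
import re


def driver_at_least(version, minimum=(5, 8)):
    """Return True if the leading major.minor of version is >= minimum."""
    match = re.match(r"(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # Compare as integer tuples so that 5.10 sorts after 5.8.
    return (int(match.group(1)), int(match.group(2))) >= minimum


if __name__ == "__main__":
    for candidate in ("5.0-0", "5.8"):
        print(candidate, driver_at_least(candidate))
```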