Created attachment 401 [details]
Test results

We originally ran our microservices on CentOS 7 and tuned the systems for high concurrency by modifying limits.conf and sysctl.conf. After migrating to Anolis OS 7.9 and applying the same tuning, Anolis 7.9 delivers lower throughput and worse stability than CentOS 7 on identical configurations: performance is 15%~50% lower and the results fluctuate far more.

1. Problem description. We are benchmarking application performance; the call flow is:
1.1. The caller sends encrypted data to the application service through its API;
1.2. The application service decrypts the data and writes it into a cache.

2. Test environment (Anolis)
2.1. Architecture: JMeter on the load-generator server ----> Nginx load balancer ------> service cluster;
2.2. Load-generator server: 1 VM, Windows 2012, 16 cores, 64 GB RAM, JMeter 5, 1000 concurrent threads;
2.3. Nginx server: 1 VM, CentOS 7.6, 16 cores, 64 GB RAM;
2.4. Application servers: 5 VMs, Anolis OS 7.9, 16 cores, 64 GB RAM, 20 microservices per VM;
2.5. Network: internal 10 GbE NICs;
2.6. Results
Concurrency: 1000
Min: 1 ms
Max: 8664 ms
Avg: 36 ms
Throughput: 14532 TPS
Fluctuation: large, see attachment

3. Test environment (CentOS)
3.1. Architecture: JMeter on the load-generator server ----> Nginx load balancer ------> service cluster;
3.2. Load-generator server: 1 VM, Windows 2012, 16 cores, 64 GB RAM, JMeter 5, 1000 concurrent threads;
3.3. Nginx server: 1 VM, CentOS 7.6, 16 cores, 64 GB RAM;
3.4. Application servers: 5 VMs, CentOS 7.6, 16 cores, 64 GB RAM, 20 microservices per VM;
3.5. Network: internal 10 GbE NICs;
3.6. Results
Concurrency: 1000
Min: 1 ms
Max: 3323 ms
Avg: 19 ms
Throughput: 16605 TPS
Fluctuation: small, see attachment
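(The report does not list the exact limits.conf / sysctl.conf values that were applied. For reference, a high-concurrency tuning profile of the kind described above often looks like the sketch below; every value here is an illustrative assumption, not the reporter's actual configuration.)

```bash
# Illustrative example only -- not the tuning actually used in this report.
# /etc/security/limits.conf: raise per-process file descriptor limits, e.g.
#   * soft nofile 655350
#   * hard nofile 655350

# /etc/sysctl.conf: common short-connection / high-concurrency settings
cat >> /etc/sysctl.conf <<'EOF'
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
net.ipv4.ip_local_port_range = 1024 65000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_max_tw_buckets = 180000
EOF
sysctl -p
```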
The application servers are 10 virtual machines in total.
(In reply to lzs109 from comment #1)
> The application servers are 10 virtual machines in total.

1. Are the VM configurations used for CentOS and Anolis identical?
2. Is the network topology between the 5 CentOS application servers, the 5 Anolis application servers, and Nginx the same?
3. Can you attach the network statistics of both the CentOS and the Anolis application servers (output of netstat -s)?
1. Are the VM configurations used for CentOS and Anolis identical?
The configurations are identical: 16 CPU cores, 64 GB RAM, 300 GB disk.

2. Is the network topology between the 5 CentOS application servers, the 5 Anolis application servers, and Nginx the same?
There are 10 application-server VMs in total and the network topology is identical; the load generator and Nginx are the same machines, and the only difference is the operating system of the application servers.

3. Can you attach the network statistics of both the CentOS and the Anolis application servers (output of netstat -s)?

Anolis server:

    Ip:
        3571488 total packets received
        0 forwarded
        0 incoming packets discarded
        3571488 incoming packets delivered
        3622111 requests sent out
        4 outgoing packets dropped
    Icmp:
        8 ICMP messages received
        0 input ICMP message failed.
        ICMP input histogram:
            destination unreachable: 8
        8 ICMP messages sent
        0 ICMP messages failed
        ICMP output histogram:
            destination unreachable: 8
    IcmpMsg:
            InType3: 8
            OutType3: 8
    Tcp:
        1325 active connections openings
        692643 passive connection openings
        83 failed connection attempts
        348 connection resets received
        122 connections established
        3568120 segments received
        3623321 segments send out
        74 segments retransmited
        0 bad segments received.
        103 resets sent
    Udp:
        3348 packets received
        8 packets to unknown port received.
        0 packet receive errors
        649 packets sent
        0 receive buffer errors
        0 send buffer errors
    UdpLite:
    TcpExt:
        11 invalid SYN cookies received
        3 resets received for embryonic SYN_RECV sockets
        457322 TCP sockets finished time wait in fast timer
        26921 delayed acks sent
        420 delayed acks further delayed because of locked socket
        Quick ack mode was activated 59 times
        59 times the listen queue of a socket overflowed
        59 SYNs to LISTEN sockets dropped
        2026 packets directly queued to recvmsg prequeue.
        7183 bytes directly received in process context from prequeue
        31602 packet headers predicted
        1387314 acknowledgments not containing data payload received
        28479 predicted acknowledgments
        3 congestion windows recovered without slow start after partial ack
        61 other TCP timeouts
        11 times receiver scheduled too late for direct processing
        TCPSpuriousRTOs: 2
        TCPBacklogDrop: 74
        TCPTimeWaitOverflow: 191067
        TCPRcvCoalesce: 7871
        TCPOFOQueue: 8
        TCPChallengeACK: 3
        TCPSYNChallenge: 2
        TCPSpuriousRtxHostQueues: 2
        TCPAutoCorking: 3474
        TCPSynRetrans: 50
        TCPOrigDataSent: 1484632
        TCPHystartTrainDetect: 1
        TCPHystartTrainCwnd: 25
    IpExt:
        InMcastPkts: 2879
        OutMcastPkts: 174
        InOctets: 769399718
        OutOctets: 274717772
        InMcastOctets: 296092
        OutMcastOctets: 18410
        InNoECTPkts: 3583813
        InECT0Pkts: 3

CentOS server:

    Ip:
        169312834 total packets received
        0 forwarded
        0 incoming packets discarded
        169312833 incoming packets delivered
        170155592 requests sent out
        3 outgoing packets dropped
        10 dropped because of missing route
        1 fragments received ok
        2 fragments created
    Icmp:
        14 ICMP messages received
        0 input ICMP message failed.
        ICMP input histogram:
            destination unreachable: 14
        11 ICMP messages sent
        0 ICMP messages failed
        ICMP output histogram:
            destination unreachable: 11
    IcmpMsg:
            InType3: 14
            OutType3: 11
    Tcp:
        50682 active connections openings
        33405317 passive connection openings
        19544 failed connection attempts
        1 connection resets received
        61 connections established
        169297994 segments received
        169887869 segments send out
        573966 segments retransmited
        1360 bad segments received.
        157234 resets sent
    Udp:
        14810 packets received
        11 packets to unknown port received.
        0 packet receive errors
        15898 packets sent
        0 receive buffer errors
        0 send buffer errors
    UdpLite:
    TcpExt:
        108250 invalid SYN cookies received
        218 resets received for embryonic SYN_RECV sockets
        4002549 TCP sockets finished time wait in fast timer
        252718 delayed acks sent
        76 delayed acks further delayed because of locked socket
        Quick ack mode was activated 6338 times
        20457 packets directly queued to recvmsg prequeue.
        1176 bytes directly in process context from backlog
        1131178 bytes directly received in process context from prequeue
        410688 packet headers predicted
        3260 packets header predicted and directly queued to user
        66931466 acknowledgments not containing data payload received
        372800 predicted acknowledgments
        38 times recovered from packet loss due to fast retransmit
        10 congestion windows fully recovered without slow start
        1993 congestion windows recovered without slow start after partial ack
        1 timeouts after reno fast retransmit
        241 timeouts in loss state
        40 fast retransmits
        111 retransmits in slow start
        314059 other TCP timeouts
        1 classic Reno fast retransmits failed
        11 connections reset due to unexpected data
        1 connections reset due to early user close
        19640 connections aborted due to timeout
        TCPSpuriousRTOs: 1
        TCPTimeWaitOverflow: 26741357
        TCPRcvCoalesce: 78935
        TCPOFOQueue: 917
        TCPOFOMerge: 75
        TCPChallengeACK: 1431
        TCPSYNChallenge: 1431
        TCPAutoCorking: 4553
        TCPSynRetrans: 296522
        TCPOrigDataSent: 68541696
    IpExt:
        InNoRoutes: 1
        InMcastPkts: 2
        InBcastPkts: 2
        InOctets: 35974997008
        OutOctets: 12737175184
        InMcastOctets: 72
        InBcastOctets: 192
        InNoECTPkts: 169312816
        InECT0Pkts: 18
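When comparing the two dumps above, the counters most relevant to this issue can be pulled out directly with a plain grep over netstat -s; a minimal sketch (no environment-specific assumptions):

```bash
# Extract the counters that stand out between the two servers:
# listen-queue overflows, dropped SYNs, TIME_WAIT bucket overflows and retransmissions.
netstat -s | grep -Ei 'listen queue|SYNs to LISTEN|TCPTimeWaitOverflow|segments retransmited|TCPSynRetrans'
```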
(In reply to lzs109 from comment #0)
> ... with the same configuration and the same tuning, Anolis 7.9 shows 15%~50% lower
> performance and worse stability than CentOS 7 ...

Have you also run the comparison against CentOS 7.9?
What are the kernel versions of the CentOS 7 and Anolis 7 machines? Could you check with uname -r?
The application does two things: one part decrypts the data, the other writes it into the cache. You can measure each part separately to see where the performance drop is.
1. What decryption algorithm does your backend use? You could compare the performance of that algorithm alone on the two environments.
2. In both test environments, use numactl to bind the application services to the same NUMA node and see whether that changes the results; a sketch follows below.
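A minimal sketch of the NUMA-pinning suggestion above, assuming the microservices are ordinary Java processes started from the shell ("app.jar" and the node number are placeholders, not taken from this report):

```bash
# Show the NUMA layout of the box (16 cores, 64 GB in this setup)
numactl --hardware

# Start one microservice pinned to NUMA node 0 for both CPU and memory.
# "app.jar" is a placeholder for the real service artifact.
numactl --cpunodebind=0 --membind=0 java -jar app.jar
```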
The cause of the performance shortfall has been identified. There are two parts:

## 1. Application-server (Java) performance shortfall

First, the environment: the application servers in the test run Anolis 7.9 with kernel 3.10.0-1160.76.1.0.1.an7.x86_64, while the CentOS 7 machines being compared run CentOS 7.6 with kernel 3.10.0-693.el7.x86_64.

The main cause of the poor performance is that Nginx and the backend communicate over short-lived connections. The large number of short connections produces a large number of connections in TIME_WAIT state, which fills net.ipv4.tcp_max_tw_buckets so that new connections can no longer be established.

The fix is to upgrade to the Anolis 4.19.91-26.4.an7.x86_64 kernel, which carries an Anolis-developed feature for fast teardown of TIME_WAIT connections. Concretely, running

    sysctl -w net.ipv4.tcp_tw_timeout=3

lowers the timeout of TIME_WAIT connections from 60 s to 3 s. After upgrading the kernel and applying this setting, the application-server performance problem was resolved.

## 2. Nginx-server performance shortfall

With only the Nginx server switched and the application servers left unchanged, the measured performance still fell short of expectations.

A comparison showed that the Anolis server additionally had libvirt installed, and libvirt had configured NAT rules. The large number of NAT rules severely hurts short-connection performance, which is why short connections on Anolis were slower than on CentOS.

Fix:

Remove libvirt:

    yum remove libvirt*

Delete the bridge devices created by libvirt and flush the corresponding NAT rules:

    ip link set dev virbr0 down
    brctl delbr virbr0
    ip link del virbr0-nic
    iptables -F -t nat
    iptables -F
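A short sketch of how the TIME_WAIT diagnosis and the fix described above can be verified and made persistent. It assumes the Anolis 4.19.91-26.4.an7 kernel mentioned in this comment (net.ipv4.tcp_tw_timeout is the Anolis-specific knob that kernel provides; on other kernels the parameter does not exist):

```bash
# 1. Confirm that TIME_WAIT buckets are the bottleneck:
#    TCPTimeWaitOverflow grows whenever tcp_max_tw_buckets is exhausted.
netstat -s | grep -i TCPTimeWaitOverflow
sysctl net.ipv4.tcp_max_tw_buckets
ss -s            # socket summary, including the current number of timewait sockets

# 2. On the Anolis 4.19.91-26.4.an7 kernel, shorten the TIME_WAIT timeout
#    from the default 60 s to 3 s and make it persistent across reboots.
sysctl -w net.ipv4.tcp_tw_timeout=3
echo 'net.ipv4.tcp_tw_timeout = 3' >> /etc/sysctl.conf
```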
Finally, to summarize:

When short-connection performance is poor, check two things:

1. iptables rules, especially NAT rules. Make sure there are no such rules; otherwise they will hurt performance.

```
iptables -L -t nat
iptables -L
```

2. Whether the number of connections in TIME_WAIT state has reached the upper limit.

```
sysctl -a | grep tcp_max_tw_buckets
netstat -ant | grep -i time_wait | wc -l
```
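For convenience, the two checks above can be rolled into one small script. This is only an illustrative wrapper around the same commands; the warning threshold is an assumption, not part of the original summary:

```bash
#!/usr/bin/env bash
# Quick health check for short-connection performance problems.

echo "== NAT rules (should normally be empty on an application server) =="
iptables -L -t nat -n --line-numbers

echo "== TIME_WAIT usage vs. tcp_max_tw_buckets =="
limit=$(sysctl -n net.ipv4.tcp_max_tw_buckets)
current=$(netstat -ant | grep -ci time_wait)
echo "time_wait sockets: ${current} / limit: ${limit}"
if [ "${current}" -ge "${limit}" ]; then
    echo "WARNING: TIME_WAIT bucket limit reached -- new connections may fail"
fi
```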