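Production issue:
The client could not push data to the server.
Investigation:
- Pinging the IP and telnetting to the port both worked normally, so basic connectivity checks did not reveal the problem.
- Inspecting a packet capture with Wireshark showed something odd: the advertised window was not fixed, and although it fluctuated, the overall trend was downward until it reached 0. The packets seen on the server side were: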
15:41:29.680256 IP 110.89.84.123.1950 > 110.89.84.126.52021: Flags [.], ack 107925, win 38, options [nop,nop,TS val 1604471956 ecr 1606303303], length 0
    0x0000: 0022 462c a12f d067 e50f e893 0800 4500 ."F,./.g......E.
    0x0010: 0034 a79b 4000 4006 d417 0b0c 547b 0b0c .4..@.@.....T{..
    0x0020: 547e 079e cb35 0c6f 535c 531b 640c 8010 T~...5.oSS.d...
    0x0030: 0026 8383 0000 0101 080a 5fa2 4c94 5fbe .&........_.L._.
    0x0040: 3e47 >G
15:41:29.719474 IP 110.89.84.123.1950 > 110.89.84.126.52021: Flags [.], ack 112269, win 5, options [nop,nop,TS val 1604471996 ecr 1606303303], length 0
    0x0000: 0022 462c a12f d067 e50f e893 0800 4500 ."F,./.g......E.
    0x0010: 0034 a79c 4000 4006 d416 0b0c 547b 0b0c .4..@.@.....T{..
    0x0020: 547e 079e cb35 0c6f 535c 531b 7504 8010 T~...5.oSS.u...
    0x0030: 0005 7284 0000 0101 080a 5fa2 4cbc 5fbe ..r......._.L._.
    0x0040: 3e47 >G
15:41:29.934875 IP 110.89.84.126.52021 > 110.89.84.123.1950: Flags [P.], seq 112269:112909, ack 88, win 115, options [nop,nop,TS val 1606303559 ecr 1604471996], length 640
    0x0000: d067 e50f e893 0022 462c a12f 0800 4500 .g....."F,./..E.
    0x0010: 02b4 5a89 4000 4006 1eaa 0b0c 547e 0b0c ..Z.@.@.....T~..
    0x0020: 547b cb35 079e 531b 7504 0c6f 535c 8018 T{.5..S.u..oS..
    0x0030: 0073 c1b7 0000 0101 080a 5fbe 3f47 5fa2 .s........_.?G_.
15:41:29.975487 IP 110.89.84.123.1950 > 110.89.84.126.52021: Flags [.], ack 116613, win 10, options [nop,nop,TS val 1604472252 ecr 1606303559], length 0
    0x0000: 0022 462c a12f d067 e50f e893 0800 4500 ."F,./.g......E.
    0x0010: 0034 a79e 4000 4006 d414 0b0c 547b 0b0c .4..@.@.....T{..
    0x0020: 547e 079e cb35 0c6f 535c 531b 85fc 8010 T~...5.oSS.....
    0x0030: 000a 5f87 0000 0101 080a 5fa2 4dbc 5fbe .._......._.M._.
    0x0040: 3f47 ?G
15:41:30.191875 IP 110.89.84.126.52021 > 110.89.84.123.1950: Flags [P.], seq 116613:117893, ack 88, win 115, options [nop,nop,TS val 1606303816 ecr 1604472252], length 1280
    0x0000: d067 e50f e893 0022 462c a12f 0800 4500 .g....."F,./..E.
    0x0010: 0534 5a8d 4000 4006 1c26 0b0c 547e 0b0c .4Z.@.@..&..T~..
    0x0020: 547b cb35 079e 531b 85fc 0c6f 535c 8018 T{.5..S....oS..
    0x0030: 0073 c437 0000 0101 080a 5fbe 4048 5fa2 .s.7......_.@H_.
    0x0040: 4dbc 2037 3435 6634 3361 3238 3334 6534 M..745f43a2834e4
    0x0050: 6465 3462 3561 3862 6630 3031 3333 6564 de4b5a8bf00133ed
    0x0060: 6462 3401 0d01 0400 0000 5308 0b10 0000 db4.......S.....
15:41:30.192523 IP 110.89.84.123.1950 > 110.89.84.126.52021: Flags [.], ack 117893, win 0, options [nop,nop,TS val 1604472469 ecr 1606303816], length 0
    0x0000: 0022 462c a12f d067 e50f e893 0800 4500 ."F,./.g......E.
    0x0010: 0034 a79f 4000 4006 d413 0b0c 547b 0b0c .4..@.@.....T{..
    0x0020: 547e 079e cb35 0c6f 535c 531b 8afc 8010 T~...5.oSS.....
    0x0030: 0000 58b7 0000 0101 080a 5fa2 4e95 5fbe ..X......._.N._.
    0x0040: 4048 @H
15:41:30.406872 IP 110.89.84.126.52021 > 110.89.84.123.1950: Flags [.], ack 88, win 115, options [nop,nop,TS val 1606304031 ecr 1604472469], length 0
    0x0000: d067 e50f e893 0022 462c a12f 0800 4500 .g....."F,./..E.
    0x0010: 0034 5a8e 4000 4006 2125 0b0c 547e 0b0c .4Z.@.@.!%..T~..
    0x0020: 547b cb35 079e 531b 8afb 0c6f 535c 8010 T{.5..S....oS..
    0x0030: 0073 bf37 0000 0101 080a 5fbe 411f 5fa2 .s.7......_.A._.
    0x0040: 4e95 N.
15:41:30.407143 IP 110.89.84.123.1950 > 110.89.84.126.52021: Flags [.], ack 117893, win 0, options [nop,nop,TS val 1604472683 ecr 1606303816], length 0
    0x0000: 0022 462c a12f d067 e50f e893 0800 4500 ."F,./.g......E.
    0x0010: 0034 a7a0 4000 4006 d412 0b0c 547b 0b0c .4..@.@.....T{..
    0x0020: 547e 079e cb35 0c6f 535c 531b 8afc 8010 T~...5.oSS.....
    0x0030: 0000 57e1 0000 0101 080a 5fa2 4f6b 5fbe ..W......._.Ok_.
    0x0040: 4048
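- From this point on, the server keeps advertising a receive window of 0, so the client cannot send any more data to the server.
- Checking the kernel TCP send and receive queues on the server with netstat -ant showed that the connection from the client has a large number of bytes sitting in the receive queue (Recv-Q) that are not being drained, while the client remains unable to send.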
[root@xdja tomcat]# netstat -ant
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address               Foreign Address             State
tcp        0      0 110.89.84.123:14468         110.89.84.33:1950           ESTABLISHED
tcp        0      0 :::1950                     :::*                        LISTEN
tcp   115005      0 ::ffff:110.89.84.123:1950   ::ffff:110.89.84.126:52021  ESTABLISHED
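The window trend can also be read off the capture mechanically instead of by eye. The following is a minimal Python sketch, assuming the capture is available as tcpdump-style text; the names `advertised_windows` and `first_zero_window` are hypothetical helpers written for this illustration, and the header lines are copied from the capture above. Note that `win` is the raw value of the TCP window field; if window scaling was negotiated the real window is larger, but 0 is still 0.

```python
import re

# Header lines copied from the capture above (hex dumps omitted).
CAPTURE = """\
15:41:29.680256 IP 110.89.84.123.1950 > 110.89.84.126.52021: Flags [.], ack 107925, win 38, options [nop,nop,TS val 1604471956 ecr 1606303303], length 0
15:41:29.719474 IP 110.89.84.123.1950 > 110.89.84.126.52021: Flags [.], ack 112269, win 5, options [nop,nop,TS val 1604471996 ecr 1606303303], length 0
15:41:29.934875 IP 110.89.84.126.52021 > 110.89.84.123.1950: Flags [P.], seq 112269:112909, ack 88, win 115, options [nop,nop,TS val 1606303559 ecr 1604471996], length 640
15:41:29.975487 IP 110.89.84.123.1950 > 110.89.84.126.52021: Flags [.], ack 116613, win 10, options [nop,nop,TS val 1604472252 ecr 1606303559], length 0
15:41:30.191875 IP 110.89.84.126.52021 > 110.89.84.123.1950: Flags [P.], seq 116613:117893, ack 88, win 115, options [nop,nop,TS val 1606303816 ecr 1604472252], length 1280
15:41:30.192523 IP 110.89.84.123.1950 > 110.89.84.126.52021: Flags [.], ack 117893, win 0, options [nop,nop,TS val 1604472469 ecr 1606303816], length 0
15:41:30.406872 IP 110.89.84.126.52021 > 110.89.84.123.1950: Flags [.], ack 88, win 115, options [nop,nop,TS val 1606304031 ecr 1604472469], length 0
15:41:30.407143 IP 110.89.84.123.1950 > 110.89.84.126.52021: Flags [.], ack 117893, win 0, options [nop,nop,TS val 1604472683 ecr 1606303816], length 0
"""

# Matches the packet header line: timestamp, source ip.port, destination, win value.
HEADER = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d+) IP (?P<src>\S+) > (?P<dst>\S+): "
    r".*?\bwin (?P<win>\d+)"
)


def advertised_windows(text, sender):
    """Return (timestamp, win) pairs for packets whose source is `sender` (ip.port)."""
    result = []
    for line in text.splitlines():
        match = HEADER.match(line.strip())
        if match and match.group("src") == sender:
            result.append((match.group("time"), int(match.group("win"))))
    return result


def first_zero_window(windows):
    """Return the timestamp of the first zero-window advertisement, or None."""
    for timestamp, win in windows:
        if win == 0:
            return timestamp
    return None


SERVER = "110.89.84.123.1950"  # server address and port, as seen in the capture
windows = advertised_windows(CAPTURE, SERVER)
```

For this capture, `windows` holds the server's advertised values in order, and `first_zero_window(windows)` gives the time at which the server first closed its window.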
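Conclusion:
From this we can conclude that the client keeps sending data, but the server processes it more slowly overall than the client sends it, so unread data piles up in the server's receive buffer until the advertised window drops to 0.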
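To make the reasoning concrete, here is a toy model in Python of a receive buffer that fills whenever the sender outpaces the reader. It is only a sketch: the function name `ticks_until_zero_window`, the tick-based timing, and all the numbers are assumptions chosen for illustration, not measurements from this incident.

```python
def ticks_until_zero_window(buffer_size, send_per_tick, consume_per_tick,
                            max_ticks=10_000):
    """Return the tick at which the receive buffer becomes full (window 0).

    Returns None if the buffer does not fill within `max_ticks`.
    All quantities are in bytes; a "tick" is an arbitrary time unit.
    """
    queued = 0
    for tick in range(1, max_ticks + 1):
        # The sender can never send more than the free space (the advertised window).
        queued += min(send_per_tick, buffer_size - queued)
        if queued == buffer_size:
            return tick  # no free space left: the receiver advertises win 0
        # The application reads some of the queued data.
        queued -= min(consume_per_tick, queued)
    return None
```

With a hypothetical 100-byte buffer, a sender pushing 30 bytes per tick, and an application reading 10 bytes per tick, the window closes on tick 5; if the application reads at least as fast as the sender writes, the function returns `None`.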
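Solution:
Change the backend to process messages asynchronously: when a TCP message arrives, first buffer it in the application layer, then have a separate thread consume it.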
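The idea can be sketched in Python with a queue and a worker thread. This is not the original backend code: the class `AsyncMessageHandler`, its method names, and the `max_pending` limit are hypothetical, and `process` stands in for whatever slow business logic the server runs.

```python
import queue
import threading


class AsyncMessageHandler:
    """Buffers incoming messages and processes them on a worker thread."""

    def __init__(self, process, max_pending=10_000):
        self._process = process                        # slow business logic (assumed)
        self._pending = queue.Queue(maxsize=max_pending)
        self.errors = []                               # exceptions raised by process()
        self._worker = threading.Thread(target=self._consume, daemon=True)
        self._worker.start()

    def on_message(self, data):
        """Called by the socket-reading code; returns once the data is queued."""
        self._pending.put(data)  # blocks only if the queue is full

    def _consume(self):
        while True:
            data = self._pending.get()
            if data is None:     # sentinel put by close(): stop the worker
                return
            try:
                self._process(data)
            except Exception as exc:
                # Keep the worker alive; record the failure for inspection.
                self.errors.append(exc)

    def close(self):
        """Process everything already queued, then stop the worker thread."""
        self._pending.put(None)
        self._worker.join()
```

The socket reader calls `on_message` and returns to reading immediately, so the kernel receive buffer keeps draining even while a message is being processed. The queue only absorbs bursts: if processing is slower than the arrival rate over the long run, the bounded queue fills, `on_message` blocks, and the zero window comes back, so the consumer still has to keep up on average.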