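Background and symptoms:
The internal development environment runs CentOS 6.8 x64. Requests to a third-party API were very slow, and the application reported timeout errors.
Analysis and resolution:
A colleague used curl and determined that name resolution was the slow step.
curl requests to http://www.baidu.com were normal and did not show the problem.
The symptoms above can be reproduced reliably.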
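The curl output shows that the problem lies in name resolution. After the colleague changed the nameserver in /etc/resolv.conf to 223.5.5.5, the symptom disappeared. The colleague strongly suspected our self-hosted DNS, and on the surface it did look like the culprit. But the root cause of a problem is often far removed from its surface symptoms, so this needed careful verification. Start with a packet capture:
Comparing the captures shows that when curl issues the HTTP request, the client sends the IPv4 (A) and IPv6 (AAAA) lookups at the same time. Both lookups for www.baidu.com are answered quickly, while the AAAA lookup for thirdwx.qlogo.cn is answered very slowly. The AAAA lookup times out, which stretches the total resolution time and makes the API call slow. Judging from the symptoms, AAAA lookups for other companies' domains are fine and only thirdwx.qlogo.cn is affected. First find the IP of the authoritative DNS for thirdwx.qlogo.cn, then use dig to query its A, AAAA, and other record types: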
[root@dev_bao_shops ~]# time dig thirdwx.qlogo.cn A @123.151.66.83
; <<>> DiG 9.15.1 <<>> thirdwx.qlogo.cn A @123.151.66.83
;; global options: cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 53154
;; flags: qr aa rd ad; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: cf9fb55d0d033f5a (echoed)
;; QUESTION SECTION:
;thirdwx.qlogo.cn. IN A
;; ANSWER SECTION:
thirdwx.qlogo.cn. 600 IN A 183.36.108.14
thirdwx.qlogo.cn. 600 IN A 183.36.108.15
thirdwx.qlogo.cn. 600 IN A 183.36.108.126
thirdwx.qlogo.cn. 600 IN A 183.36.108.13
;; Query time: 31 msec
;; SERVER: 123.151.66.83#53(123.151.66.83)
;; WHEN: Thu Jul 11 13:21:26 CST 2019
;; MSG SIZE rcvd: 121
[root@dev_bao_shops ~]#
[root@dev_bao_shops ~]# dig thirdwx.qlogo.cn AAAA @123.151.66.83
; <<>> DiG 9.15.1 <<>> thirdwx.qlogo.cn AAAA @123.151.66.83
;; global options: cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 30789
;; flags: qr aa rd ad; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 23cd5c1dcdb0d369 (echoed)
;; QUESTION SECTION:
;thirdwx.qlogo.cn. IN AAAA
;; ANSWER SECTION:
thirdwx.qlogo.cn. 600 IN CNAME cwx.qlogo.cn.
;; Query time: 36 msec
;; SERVER: 123.151.66.83#53(123.151.66.83)
;; WHEN: Thu Jul 11 13:21:30 CST 2019
;; MSG SIZE rcvd: 75
[root@dev_bao_shops ~]# dig thirdwx.qlogo.cn cname @123.151.66.83
; <<>> DiG 9.15.1 <<>> thirdwx.qlogo.cn cname @123.151.66.83
;; global options: cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 25600
;; flags: qr aa rd ad; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 3b4a59fbabde1ad7 (echoed)
;; QUESTION SECTION:
;thirdwx.qlogo.cn. IN CNAME
;; ANSWER SECTION:
thirdwx.qlogo.cn. 600 IN CNAME cwx.qlogo.cn.
;; Query time: 35 msec
;; SERVER: 123.151.66.83#53(123.151.66.83)
;; WHEN: Fri Jul 12 20:41:43 CST 2019
;; MSG SIZE rcvd: 75
All of the responses carry the aa flag, so they are authoritative answers. RFC 1034 (http://tools.ietf.org/pdf/rfc1034), section 3.6.2, states:
If a CNAME RR is present at a node, no other data should be present; this ensures that the data for a canonical name and its aliases cannot be different.
In other words, the RFC makes a CNAME mutually exclusive with every other record type at the same name.
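The exclusivity rule can be checked mechanically against a set of observed records. The following is a minimal sketch, not part of the original investigation: `cname_conflicts` is a hypothetical helper, the record list is typed in by hand from the dig output above, and the DNSSEC-related exceptions added by later RFCs are ignored.

```python
from collections import defaultdict


def cname_conflicts(records):
    """Return owner names that hold a CNAME together with other record types.

    `records` is an iterable of (owner_name, record_type) pairs.
    Simplification: DNSSEC record types are not treated specially.
    """
    types_by_name = defaultdict(set)
    for name, rtype in records:
        # DNS names are case-insensitive; ignore an optional trailing dot.
        types_by_name[name.lower().rstrip(".")].add(rtype.upper())
    return sorted(
        name
        for name, types in types_by_name.items()
        if "CNAME" in types and len(types) > 1
    )


# Record types observed at the authoritative server in the dig sessions above.
observed = [
    ("thirdwx.qlogo.cn.", "A"),
    ("thirdwx.qlogo.cn.", "CNAME"),
    ("cwx.qlogo.cn.", "A"),
]
print(cname_conflicts(observed))  # ['thirdwx.qlogo.cn']
```

Only thirdwx.qlogo.cn is flagged, because it is the only name that returned both a CNAME and A records.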
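On the qlogo.cn authoritative DNS, thirdwx.qlogo.cn has both a CNAME and A records. The only plausible guess is that the qlogo.cn authoritative DNS is an in-house implementation that does not strictly follow RFC 1034 (at least the open-source BIND and dnsmasq cannot be set up this way). The slowness is very likely related to the qlogo.cn authoritative DNS. Look at the AAAA response: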
thirdwx.qlogo.cn CNAME cwx.qlogo.cn.
The resolver goes on to resolve cwx.qlogo.cn recursively, so the next step is a dig AAAA query for cwx.qlogo.cn:
[root@dev_bao_shops ~]# dig cwx.qlogo.cn. AAAA @123.151.66.83
; <<>> DiG 9.15.1 <<>> cwx.qlogo.cn. AAAA @123.151.66.83
;; global options: cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 41366
;; flags: qr aa rd ad; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; WARNING: recursion requested but not available
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: b0a4c4c7de0ffd87 (echoed)
;; QUESTION SECTION:
;cwx.qlogo.cn. IN AAAA
;; AUTHORITY SECTION:
qlogo.cn. 43200 IN SOA ns1.qq.com. webmaster.qq.com. 1273457866 300 600 86400 300
;; Query time: 34 msec
;; SERVER: 123.151.66.83#53(123.151.66.83)
;; WHEN: Thu Jul 11 15:08:28 CST 2019
;; MSG SIZE rcvd: 109
Querying the authoritative DNS directly shows no fault.
Now send the AAAA query to public DNS servers and observe the result:
[root@dev_bao_shops ~]# dig cwx.qlogo.cn AAAA @223.5.5.5
; <<>> DiG 9.15.1 <<>> cwx.qlogo.cn AAAA @223.5.5.5
;; global options: cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 5524
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;cwx.qlogo.cn. IN AAAA
;; Query time: 12 msec
;; SERVER: 223.5.5.5#53(223.5.5.5)
;; WHEN: Thu Jul 11 15:14:57 CST 2019
;; MSG SIZE rcvd: 30
[root@localhost ~]# dig AAAA cwx.qlogo.cn @114.114.114.114
; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.47.rc1.el6 <<>> AAAA cwx.qlogo.cn @114.114.114.114
;; global options: cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 63403
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;cwx.qlogo.cn. IN AAAA
;; Query time: 17 msec
;; SERVER: 114.114.114.114#53(114.114.114.114)
;; WHEN: Thu Jul 11 16:00:59 2019
;; MSG SIZE rcvd: 30
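Through public DNS, the status is SERVFAIL in both cases. At this point the preliminary cause is clear.
First, the DNS architecture in the faulty environment:
BIND caching/recursive DNS ---> office domain-controller DNS ---> 114.114.114.114 public DNS ----> the domain's authoritative DNS.
By default BIND does not put SERVFAIL into its negative cache, though it does cache it for about 1 s. The domain-controller DNS behaves the same way (it does not cache SERVFAIL), and public DNS services cache SERVFAIL slightly longer (implementations differ a little). As a result, every AAAA lookup for cwx.qlogo.cn triggers a full AAAA recursion:
BIND ---> domain-controller DNS ---> public DNS 114.114.114.114 ---> authoritative DNS. This makes the total resolution time long and in turn slows down the third-party API call. Normal A, AAAA, and CNAME responses (status other than SERVFAIL) are cached at every level, including in the negative cache, which avoids recursing every time, so other domains (such as www.baidu.com) resolve quickly.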
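The effect of the short SERVFAIL cache lifetime can be illustrated with a toy model. This is a simplified sketch under stated assumptions, not a model of BIND internals: `CachingForwarder`, `fake_upstream`, and `count_upstream_queries` are hypothetical names, a single cache stands in for the whole chain, SERVFAIL is held for 1 s, and other answers are held for an assumed TTL of 300 s.

```python
class CachingForwarder:
    """Toy cache in front of an upstream resolver."""

    def __init__(self, upstream, servfail_ttl=1.0):
        self.upstream = upstream          # callable: (name, qtype) -> (status, ttl)
        self.servfail_ttl = servfail_ttl  # assumed ~1 s, as described above
        self.cache = {}                   # (name, qtype) -> (status, expires_at)
        self.upstream_queries = 0

    def query(self, name, qtype, now):
        key = (name, qtype)
        hit = self.cache.get(key)
        if hit is not None and now < hit[1]:
            return hit[0]                 # served from cache, no recursion
        self.upstream_queries += 1
        status, ttl = self.upstream(name, qtype)
        if status == "SERVFAIL":
            ttl = self.servfail_ttl       # failures are only held briefly
        self.cache[key] = (status, now + ttl)
        return status


def fake_upstream(name, qtype):
    # Hypothetical stand-in for the real recursion path.
    if (name, qtype) == ("cwx.qlogo.cn", "AAAA"):
        return "SERVFAIL", 0
    return "NOERROR", 300


def count_upstream_queries(name, qtype, requests=10, interval=5.0):
    """Send `requests` lookups spaced `interval` seconds apart."""
    forwarder = CachingForwarder(fake_upstream)
    for i in range(requests):
        forwarder.query(name, qtype, now=i * interval)
    return forwarder.upstream_queries


print(count_upstream_queries("cwx.qlogo.cn", "AAAA"))   # 10
print(count_upstream_queries("www.baidu.com", "AAAA"))  # 1
```

With lookups arriving five seconds apart, every SERVFAIL lookup goes upstream again, while the cacheable answer is fetched once.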
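Next, pin down why the recursive query returns SERVFAIL. Both BIND's own recursion and the public DNS recursion end in SERVFAIL, so BIND can be used to locate the cause (the RFCs and the open-source code are the only window available).
Turn off all forwarding on the local BIND, enable its own recursion, turn on debug logging, and issue an AAAA lookup for cwx.qlogo.cn.
[root@dev_bao_shops ~]# time dig cwx.qlogo.cn AAAA @127.0.0.1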
; <<>> DiG 9.15.1 <<>> cwx.qlogo.cn AAAA @127.0.0.1
;; global options: cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 62958
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;cwx.qlogo.cn. IN AAAA
;; Query time: 1457 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Thu Jul 11 15:36:44 CST 2019
;; MSG SIZE rcvd: 41
10-May-2019 18:14:57.667 resolver: fctx 0x7f382102d010(cwx.qlogo.cn/AAAA): noanswer_response
10-May-2019 18:14:57.667 resolver: log_ns_ttl: fctx 0x7f382102d010: noanswer_response: cwx.qlogo.cn (in 'cwx.qlogo.cn'?): 1 86400
10-May-2019 18:14:57.667 resolver: DNS format error from 203.205.144.156#53 resolving cwx.qlogo.cn/AAAA for client 192.168.94.21#53139: Name qlogo.cn (SOA) not subdomain of zone cwx.qlogo.cn -- invalid response
Download the source code of the matching version. With the logged error message, the relevant file and code section are quickly located:
lib/dns/resolver.c
/*
 * Trigger lookups for DNS nameservers.
 */
if (negative_response && message->rcode == dns_rcode_noerror &&
    fctx->type == dns_rdatatype_ds && soa_name != NULL &&
    dns_name_equal(soa_name, qname) &&
    !dns_name_equal(qname, dns_rootname))
        return (DNS_R_CHASEDSSERVERS);
/*
 * Did we find anything?
 */
if (!negative_response && ns_name == NULL) {
    /*
     * Nope.
     */
    if (oqname != NULL) {
        /*
         * We've already got a partial CNAME/DNAME chain,
         * and haven't found else anything useful here, but
         * no error has occurred since we have an answer.
         */
        return (ISC_R_SUCCESS);
    } else {
        /*
         * The responder is insane.
         */
        if (save_name == NULL) {
            log_formerr(fctx, "invalid response");
            return (DNS_R_FORMERR);
        }
        if (!dns_name_issubdomain(save_name, &fctx->domain)) {
            char nbuf[DNS_NAME_FORMATSIZE];
            char dbuf[DNS_NAME_FORMATSIZE];
            char tbuf[DNS_RDATATYPE_FORMATSIZE];
            dns_rdatatype_format(save_type, tbuf,
                                 sizeof(tbuf));
            dns_name_format(save_name, nbuf, sizeof(nbuf));
            dns_name_format(&fctx->domain, dbuf,
                            sizeof(dbuf));
            log_formerr(fctx, "Name %s (%s) not subdomain"
                        " of zone %s -- invalid response",
                        nbuf, tbuf, dbuf);
        } else {
            log_formerr(fctx, "invalid response");
        }
        return (DNS_R_FORMERR);
    }
}
The function is long, so only part of it is shown. BIND raises the error inside the noanswer_response function in resolver.c.
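The check that produces the logged error, dns_name_issubdomain(save_name, &fctx->domain), is in essence a label-wise suffix test. Below is a minimal Python sketch of that idea. It is not BIND's implementation, and `labels` and `is_subdomain` are hypothetical helper names.

```python
def labels(name):
    """Split a domain name into lower-cased labels, ignoring a trailing dot."""
    name = name.rstrip(".").lower()
    return name.split(".") if name else []


def is_subdomain(name, zone):
    """True if `name` equals `zone` or lies below it (label-wise suffix match)."""
    n, z = labels(name), labels(zone)
    return len(n) >= len(z) and n[len(n) - len(z):] == z


# The SOA owner name from the response versus the zone being resolved.
print(is_subdomain("qlogo.cn.", "cwx.qlogo.cn."))  # False
print(is_subdomain("cwx.qlogo.cn.", "qlogo.cn."))  # True
```

The first call mirrors the failing case in the log: the SOA owner qlogo.cn sits above the zone cwx.qlogo.cn, so the test fails and the response is treated as a format error.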
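See the NODATA response description in RFC 2308 (reading the constraints for both the NXDOMAIN and NOERROR statuses side by side makes the code logic easier to follow). RFC 2308 lists three NODATA response types, and the response in this incident is type 2 (in this case the NS named in the SOA is read).
(https://tools.ietf.org/html/rfc2308, section 2.2)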
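The NODATA response types in RFC 2308 section 2.2 differ only in what the authority section carries, so the classification can be sketched in a few lines. This is an illustrative simplification: `classify_nodata` is a hypothetical helper, it takes hand-extracted header fields rather than a parsed message, and it ignores CNAME chains in the answer section.

```python
def classify_nodata(rcode, answer_count, authority_types):
    """Classify a response along the NODATA examples of RFC 2308 section 2.2.

    Returns "TYPE 1", "TYPE 2", "TYPE 3", "REFERRAL", or None when the
    response is not a NODATA candidate (simplified: any answer record
    disqualifies it).
    """
    if rcode != "NOERROR" or answer_count != 0:
        return None
    types = {t.upper() for t in authority_types}
    has_soa, has_ns = "SOA" in types, "NS" in types
    if has_soa and has_ns:
        return "TYPE 1"    # SOA and NS records in the authority section
    if has_soa:
        return "TYPE 2"    # SOA only
    if has_ns:
        return "REFERRAL"  # NS without SOA reads as a referral
    return "TYPE 3"        # nothing useful in the authority section


# The cwx.qlogo.cn AAAA response shown earlier: NOERROR, no answer, one SOA.
print(classify_nodata("NOERROR", 0, ["SOA"]))  # TYPE 2
```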
Next, look at the NS delegation (glue) for the domain in detail:
[root@ops-ntp2 ~]# dig cwx.qlogo.cn +trace
; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.68.rc1.el6 <<>> cwx.qlogo.cn +trace
;; global options: cmd
. 544 IN NS m.root-servers.net.
. 544 IN NS c.root-servers.net.
. 544 IN NS k.root-servers.net.
. 544 IN NS d.root-servers.net.
. 544 IN NS f.root-servers.net.
. 544 IN NS l.root-servers.net.
. 544 IN NS b.root-servers.net.
. 544 IN NS i.root-servers.net.
. 544 IN NS g.root-servers.net.
. 544 IN NS h.root-servers.net.
. 544 IN NS e.root-servers.net.
. 544 IN NS j.root-servers.net.
. 544 IN NS a.root-servers.net.
;; Received 508 bytes from 10.0.0.91#53(10.0.0.91) in 6 ms
cn. 172800 IN NS c.dns.cn.
cn. 172800 IN NS g.dns.cn.
cn. 172800 IN NS b.dns.cn.
cn. 172800 IN NS ns.cernet.net.
cn. 172800 IN NS e.dns.cn.
cn. 172800 IN NS f.dns.cn.
cn. 172800 IN NS a.dns.cn.
cn. 172800 IN NS d.dns.cn.
;; Received 357 bytes from 198.41.0.4#53(198.41.0.4) in 516 ms
qlogo.cn. 86400 IN NS ns4.qq.com.
qlogo.cn. 86400 IN NS ns3.qq.com.
qlogo.cn. 86400 IN NS ns1.qq.com.
qlogo.cn. 86400 IN NS ns2.qq.com.
;; Received 108 bytes from 203.119.26.1#53(203.119.26.1) in 4188 ms
cwx.qlogo.cn. 86400 IN NS ns-tel1.qq.com.
cwx.qlogo.cn. 86400 IN NS ns-tel2.qq.com.
;; Received 144 bytes from 101.89.19.165#53(101.89.19.165) in 322 ms
cwx.qlogo.cn. 300 IN A 182.254.104.16
;; Received 46 bytes from 183.2.186.153#53(183.2.186.153) in 30 ms
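The dig trace shows that qlogo.cn and cwx.qlogo.cn are two separate delegated zones (zones, not just domain names): qlogo.cn is the parent zone and cwx.qlogo.cn is the child zone. The NS that qlogo.cn delegates for cwx.qlogo.cn (the glue record) is ns-tel1.qq.com:
[root@ops-ntp2 ~]# dig ns cwx.qlogo.cn @ns-tel1.qq.com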
; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.68.rc1.el6 <<>> ns cwx.qlogo.cn @ns-tel1.qq.com
;; global options: cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 37177
;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0
;; WARNING: recursion requested but not available
;; QUESTION SECTION:
;cwx.qlogo.cn. IN NS
;; AUTHORITY SECTION:
qlogo.cn. 43200 IN SOA ns1.qq.com. webmaster.qq.com. 1273457866 300 600 86400 300
;; Query time: 36 msec
;; SERVER: 123.151.66.83#53(123.151.66.83)
;; WHEN: Fri Jul 12 17:33:34 2019
;; MSG SIZE rcvd: 86
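Querying the NS records of cwx.qlogo.cn at its authoritative DNS, ns-tel1.qq.com, shows that the child zone cwx.qlogo.cn has no NS records of its own. In addition, the SOA in the authority section of the response names the parent zone's name server (ns1.qq.com). This is the problem: the NS of qlogo.cn is ns1.qq.com, and qlogo.cn delegates the child zone cwx.qlogo.cn to ns-tel1.qq.com, yet the SOA returned for the child zone still points to the parent's NS (ns1.qq.com). This breaks resolution in the recursive resolver (see RFC 1912, section 2.8):
qlogo.cn (SOA) not subdomain of zone cwx.qlogo.cn
The result is SERVFAIL. By the same logic, A lookups for cwx.qlogo.cn might be expected to hit the same problem, but in practice they do not. The authoritative DNS has an A record, so an A lookup never reaches the noanswer_response logic. Even if the name cwx.qlogo.cn had no records of any type there would be no failure, because that case takes the NXDOMAIN path and still never reaches noanswer_response. NXDOMAIN and NOERROR responses can be cached at every DNS level, so they cause no performance problem.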
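So the scenario above requires at least the following:
- the name cwx.qlogo.cn has records of other types [but no AAAA]
- the NS address in the SOA of cwx.qlogo.cn is misconfigured
- the parent zone qlogo.cn delegates the child zone cwx.qlogo.cn to a different authoritative DNS, and that DNS has no NS records configured for the child zone itself
Because the qlogo.cn authoritative DNS is a third-party in-house implementation whose internals cannot be inspected, the open-source BIND was used to reproduce, trace, and locate the issue.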
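Solutions:
I. (Using the BIND from this case as the example) Adjust the DNS server side: configure NS records for cwx.qlogo.cn (for example ns-tel1.qq.com), or change the name server in the SOA of cwx.qlogo.cn to the child zone's NS (ns-tel1.qq.com).
II. Adjust client-side parameters. This only mitigates the problem.
1. Change the DNS client behavior by adding the following to /etc/resolv.conf: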
options timeout:1 attempts:1 rotate single-request-reopen
Explanation of single-request-reopen:
single-request-reopen (since glibc 2.9)
Sets RES_SNGLKUPREOP in _res.options. The resolver
uses the same socket for the A and AAAA requests. Some
hardware mistakenly sends back only one reply. When
that happens the client system will sit and wait for
the second reply. Turning this option on changes this
behavior so that if two requests from the same port are
not handled correctly it will close the socket and open
a new one before sending the second request
(In effect, this option makes the A and AAAA lookups go out from different source ports.) This mitigates the problem to some extent. The other adjustment is the timeout option in /etc/resolv.conf: a shorter timeout lets the AAAA lookup fail fast.
[root@localhost ~]# curl -w %{time_namelookup}::%{time_connect}::%{time_starttransfer}::%{time_total}::%{speed_download}"\n" http://thirdwx.qlogo.cn/mmopendr/vi_32/Q0j4TwGJoy23A
1.001::1.034::1.067::1.067::0.000
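The curl timing line is easier to compare across runs once it is split into named fields. The sketch below assumes the `::`-separated format used in the command above; `parse_curl_timing` and `dns_share` are hypothetical helper names.

```python
FIELDS = (
    "time_namelookup",
    "time_connect",
    "time_starttransfer",
    "time_total",
    "speed_download",
)


def parse_curl_timing(line):
    """Turn one '::'-separated curl -w output line into a dict of floats."""
    values = [float(v) for v in line.strip().split("::")]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


def dns_share(timing):
    """Fraction of the total request time spent on name lookup."""
    total = timing["time_total"]
    return timing["time_namelookup"] / total if total > 0 else 0.0


timing = parse_curl_timing("1.001::1.034::1.067::1.067::0.000")
print(timing["time_namelookup"])    # 1.001
print(round(dns_share(timing), 2))  # 0.94
```

Even with timeout:1 in place, name lookup still accounts for roughly 94% of the request time in this sample, which matches the point that the client-side change only mitigates the problem.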
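2. Disabling IPv6 in the OS may only change some default socket parameters at the application layer. It does not necessarily stop applications from issuing AAAA lookups. The key is that the behavior of glibc's resolver function getaddrinfo() depends on many parameters, for example the address family (AF_INET for IPv4 only, AF_INET6 for IPv6 only, AF_UNSPEC for both). glibc defaults hints.ai_family to AF_UNSPEC, so for some applications disabling IPv6 in the OS has no effect (ssh and telnet, for example, choose the address family themselves). This area is fairly complex. Where IPv6 is not needed, disabling IPv6 is still the recommended first step.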
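Where the application code is under your control, the address family can be pinned explicitly instead of relying on the OS setting. The sketch below uses Python's `socket.getaddrinfo` with `family=socket.AF_INET`, which should only need A lookups. `resolve_ipv4_only` is a hypothetical helper, and the `resolver` parameter exists only so the function can be exercised without network access.

```python
import socket


def resolve_ipv4_only(host, port=80, resolver=socket.getaddrinfo):
    """Resolve `host` while asking for IPv4 results only.

    Passing family=AF_INET instead of the default AF_UNSPEC tells the
    resolver that IPv6 addresses are not wanted.
    """
    infos = resolver(host, port, family=socket.AF_INET, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET the sockaddr is an (ip, port) pair.
    return [info[4][0] for info in infos]
```

Calling `resolve_ipv4_only("thirdwx.qlogo.cn")` would return the list of IPv4 addresses for the host. Whether a given application or library exposes an equivalent switch has to be checked case by case.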
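III. Adjust an intermediate DNS hop so that its negative cache is forced to cache the AAAA result and the recursion is avoided.
A search on GitHub turned up only a disable-aaaa patch for dnsmasq and no ready-made solution. BIND has the filter-aaaa-on-v4 yes option, but it still recurses with AAAA for AAAA lookups from clients, so BIND offers no fix here. Public DNS services cache SERVFAIL longer than BIND's default 1 s, which somewhat reduces the cost of recursion (this is why the API responded quickly at the start of the investigation once the nameserver was switched to public DNS). One more option is to enable iptables rules on the BIND server that filter out (drop) AAAA requests.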
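Any filter that drops AAAA requests has to recognize them on the wire: a DNS query whose first question carries QTYPE 28. The sketch below shows that match in Python. It is not an iptables rule, the helper names (`build_query`, `question_qtype`, `is_aaaa_query`) are hypothetical, and it assumes an uncompressed question section, as is normal in queries.

```python
import struct

QTYPE_A, QTYPE_AAAA = 1, 28


def build_query(name, qtype, query_id=0x1234):
    """Build a minimal DNS query message (one question, recursion desired)."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # QCLASS 1 = IN


def question_qtype(message):
    """Return the QTYPE of the first question in a DNS message."""
    pos = 12  # the fixed header is 12 bytes long
    while message[pos] != 0:
        pos += 1 + message[pos]  # skip one length-prefixed label
    pos += 1  # skip the zero-length root label that ends the name
    (qtype,) = struct.unpack_from("!H", message, pos)
    return qtype


def is_aaaa_query(message):
    """True if the message is a query (QR bit clear) asking for AAAA."""
    (flags,) = struct.unpack_from("!H", message, 2)
    is_query = (flags & 0x8000) == 0
    return is_query and question_qtype(message) == QTYPE_AAAA


print(is_aaaa_query(build_query("cwx.qlogo.cn", QTYPE_AAAA)))  # True
print(is_aaaa_query(build_query("cwx.qlogo.cn", QTYPE_A)))     # False
```

Dropping such queries outright makes clients wait for their own timeout, so this approach is best combined with the shorter resolv.conf timeout described above.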