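Environment:
System: UnionTech 有岳 1060A
Cluster: UnionTech 有雀, bare-metal deployment
Origin of the problem
DNS configuration on the base (bastion) node of the 有雀 cluster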
[root@bastion ~]# cat /etc/coredns/Corefile
.:53 {
template IN A apps.utccp.example.com {
match .*apps\.utccp\.example\.com
answer "{{ .Name }} 60 IN A 10.12.24.125"
fallthrough
}
hosts {
10.12.24.125 api.utccp.example.com
10.12.24.125 api-int.utccp.example.com
10.12.24.125 bastion.utccp.example.com
10.12.24.127 master1.utccp.example.com
10.12.24.128 master2.utccp.example.com
10.12.24.129 master3.utccp.example.com
fallthrough
}
prometheus
cache 160
forward . 114.114.114.114
log
}
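Problems found:
Environment: default 有雀 configuration, data from 48 hours of operation. Total requests: 660,100
Type | Count | Issue | Ratio |
IPv4 requests | 398826 | Normal requests | 0.60419 |
IPv6 requests | 206733 | The cluster defaults to IPv4 and has no IPv6 network | 0.31318 |
Duplicated base domain requests | 54541 | The cluster base domain is repeated in the name | 0.08262 |
The data above shows that abnormal requests make up about 40% of the total.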
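Problem analysis
- First, track down the source of the IPv6 requests.
In an offline cluster in its default state, DNS requests come almost entirely from the containers of the various components; DNS requests from system services are negligible.
- Investigate why the base domain is duplicated.
By default every node's hostname includes the base domain. The hostname is set with hostnamectl set-hostname, which may be where the problem comes from.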
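To trace where each kind of request comes from, the output of the CoreDNS log plugin can be bucketed by query type and name. The sketch below is one possible way to do that, not the method used to produce the table above. It assumes the default log format (query type, class and name inside the quoted section); the sample lines are made up for illustration, and counting a name as "duplicated" whenever the base domain appears twice is an assumption.

```python
import re
from collections import Counter
from typing import Optional

BASE_DOMAIN = "utccp.example.com"

# Matches the quoted part of a default-format CoreDNS log line,
# e.g. "AAAA IN api.utccp.example.com. udp 43 false 512"
QUERY_RE = re.compile(r'"(?P<qtype>[A-Z0-9]+) IN (?P<name>\S+) ')


def classify(line: str, base_domain: str = BASE_DOMAIN) -> Optional[str]:
    """Return a category for one log line, or None if it is not a query line."""
    match = QUERY_RE.search(line)
    if match is None:
        return None
    name = match.group("name").rstrip(".").lower()
    # Assumption: a duplicated base domain takes priority over the query type.
    if name.count(base_domain) >= 2:
        return "duplicate_base_domain"
    if match.group("qtype") == "AAAA":
        return "ipv6"
    if match.group("qtype") == "A":
        return "ipv4"
    return "other"


# Hypothetical sample lines, written to resemble the default log format.
SAMPLE_LINES = [
    '[INFO] 10.12.24.127:41234 - 1001 "A IN api-int.utccp.example.com. udp 43 false 512" NOERROR qr,aa,rd 84 0.000101s',
    '[INFO] 10.12.24.128:41235 - 1002 "AAAA IN api-int.utccp.example.com. udp 43 false 512" NOERROR qr,aa,rd 43 0.000093s',
    '[INFO] 10.12.24.129:41236 - 1003 "A IN master1.utccp.example.com.utccp.example.com. udp 61 false 512" NXDOMAIN qr,rd,ra 61 0.012s',
]

tally = Counter(c for c in map(classify, SAMPLE_LINES) if c is not None)
print(dict(tally))
```

Adding the client address from the start of each line as a second key would show which nodes send the AAAA and duplicated-domain queries.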
Disabling IPv6 DNS queries (client side)
- Disable IPv6 in NetworkManager.
# nmcli connection modify enp1s0 ipv6.method disabled
# systemctl restart NetworkManager
IPv6 DNS queries still reached the base node.
- Disable IPv6 through kernel parameters; run on every node:
# sysctl -w net.ipv6.conf.all.disable_ipv6=1
# sysctl -w net.ipv6.conf.all.disable_policy=1
After these two steps, IPv6 DNS queries still reached the base node.
- Disable IPv6 in avahi-daemon
# vim /etc/avahi/avahi-daemon.conf
Set:
use-ipv6=no
# systemctl restart avahi-daemon
Still no effect.
- Set an option in /etc/resolv.conf
options single-request-reopen
# systemctl restart NetworkManager
Still no effect.
- Modify the OVS configuration
# vim /etc/openvswitch/ovs-vswitchd.conf.db
other_config:ipv6_prefix=[]
# systemctl restart openvswitch.service
Still no effect.
- Comment out the IPv6 localhost entry in /etc/hosts
[root@worker1 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
#::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
Still no effect.
- Deprioritize IPv6 in /etc/gai.conf
precedence ::ffff:0:0/96 100
Still no effect.
- Remove the kernel module
modprobe -r ipv6
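# The module is built into the kernel and cannot be unloaded
- After searching the documentation, I found this description in RFC 4472, Section 5.1: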
5.1. DNS Lookups May Query IPv6 Records Prematurely
The system library that implements the getaddrinfo() function for
looking up names is a critical piece when considering the robustness
of enabling IPv6; it may come in basically three flavors:
1. The system library does not know whether IPv6 has been enabled in
the kernel of the operating system: it may start looking up AAAA
records with getaddrinfo() and AF_UNSPEC hint when the system is
upgraded to a system library version that supports IPv6.
2. The system library might start to perform IPv6 queries with
getaddrinfo() only when IPv6 has been enabled in the kernel.
However, this does not guarantee that there exists any useful
IPv6 connectivity (e.g., the node could be isolated from the
other IPv6 networks, only having link-local addresses).
3. The system library might implement a toggle that would apply some
heuristics to the "IPv6-readiness" of the node before starting to
perform queries; for example, it could check whether only link-
local IPv6 address(es) exists, or if at least one global IPv6
address exists.
First, let us consider generic implications of unnecessary queries
for AAAA records: when looking up all the records in the DNS, AAAA
records are typically tried first, and then A records. These are
done in serial, and the A query is not performed until a response is
received to the AAAA query. Considering the misbehavior of DNS
servers and load-balancers, as described in Section 3.1, the lookup
delay for AAAA may incur additional unnecessary latency, and
introduce a component of unreliability.
- Disable IPv6 with a GRUB kernel parameter
ipv6.disable=1
[root@worker1 ~]# ss -tunlp
Netid State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
udp UNCONN 0 0 0.0.0.0:111 0.0.0.0:* users:(("rpcbind",pid=793,fd=6))
udp UNCONN 0 0 0.0.0.0:33062 0.0.0.0:* users:(("avahi-daemon",pid=799,fd=16))
udp UNCONN 0 0 127.0.0.1:323 0.0.0.0:* users:(("chronyd",pid=817,fd=6))
udp UNCONN 0 0 0.0.0.0:4789 0.0.0.0:*
udp UNCONN 0 0 0.0.0.0:5353 0.0.0.0:* users:(("avahi-daemon",pid=799,fd=15))
udp UNCONN 0 0 0.0.0.0:55162 0.0.0.0:* users:(("rpcbind",pid=793,fd=7))
tcp LISTEN 0 128 0.0.0.0:111 0.0.0.0:* users:(("rpcbind",pid=793,fd=8))
tcp LISTEN 0 128 0.0.0.0:22 0.0.0.0:* users:(("sshd",pid=1208,fd=3))
tcp LISTEN 0 5 127.0.0.1:631 0.0.0.0:* users:(("cupsd",pid=1327,fd=10))
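Even with every trace of IPv6 addressing turned off, the IPv6 DNS queries could not be stopped.
- A Google search showed that glibc 2.36 addresses this by adding the no-aaaa resolver option, set in /etc/resolv.conf as options no-aaaa. From the glibc release announcement (see the references):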
* The “no-aaaa” DNS stub resolver option has been added. System
administrators can use it to suppress AAAA queries made by the stub
resolver, including AAAA lookups triggered by NSS-based interfaces
such as getaddrinfo. Only DNS lookups are affected: IPv6 data in
/etc/hosts is still used, getaddrinfo with AI_PASSIVE will still
produce IPv6 addresses, and configured IPv6 name servers are still
used. To produce correct Name Error (NXDOMAIN) results, AAAA queries
are translated to A queries. The new resolver option is intended
primarily for diagnostic purposes, to rule out that AAAA DNS queries
have adverse impact. It is incompatible with EDNS0 usage and DNSSEC
validation by applications.
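The glibc on my system is not new enough, and it cannot be upgraded casually. The goal therefore became to avoid the timeout caused by simultaneous A and AAAA queries: according to the /etc/resolv.conf manual, when both an A and an AAAA record are requested and the A reply arrives but the AAAA reply does not, the resolver waits out a 5-second timeout, which is alarming. That is what the rest of this post is about.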
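Whether a node's glibc is new enough for no-aaaa can be checked quickly. The sketch below compares the version reported by Python's platform.libc_ver() against 2.36, the release named in the announcement quoted above. The helper name supports_no_aaaa is made up for this example, and the check only looks at the upstream version number: a distribution may backport the option to an older glibc, which this cannot detect.

```python
import platform


def supports_no_aaaa(libc_version: str) -> bool:
    """Return True if a glibc version string is at least 2.36."""
    parts = libc_version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    # Compare as a tuple so that 2.4 sorts below 2.36.
    return (major, minor) >= (2, 36)


lib, version = platform.libc_ver()
if lib == "glibc" and version:
    print(f"glibc {version}: no-aaaa by version number -> {supports_no_aaaa(version)}")
else:
    print("glibc version could not be determined on this system")
```

For example, supports_no_aaaa("2.28") returns False, which matches the situation described here.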
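Server side (workaround)
Since I cannot fix the glibc side, I went after CoreDNS on the base node instead.
Explanation: by default a DNS lookup sends both an A and an AAAA query. If the server answers the AAAA query immediately, possibly even sooner than the A query, the 5-second timeout never comes into play.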
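The A plus AAAA pair comes from getaddrinfo() being called with AF_UNSPEC, as the RFC 4472 excerpt describes. As a side illustration only, and not something that can be applied to the cluster components, the sketch below shows how an application that controls its own lookups can ask for IPv4 results only by passing AF_INET. The function name resolve_ipv4_only is made up for this example, and the claim that no AAAA query is sent in this case reflects my understanding of the glibc resolver rather than something measured here.

```python
import socket
from typing import List


def resolve_ipv4_only(host: str, port: int = 0) -> List[str]:
    """Resolve a host name while asking the resolver for IPv4 results only."""
    infos = socket.getaddrinfo(
        host, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET, sockaddr is (address, port).
    return sorted({info[4][0] for info in infos})


# Example call (not executed here because it needs a reachable DNS server):
# resolve_ipv4_only("api.utccp.example.com")
```

This does not help with third-party containers, which is why the workaround here is applied on the DNS server.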
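Two solutions come to mind so far:
- Use rewrite to dispose of AAAA queries directly by rewriting them to type A (safe and reliable, recommended)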
.:53 {
rewrite stop type AAAA A
template IN A apps.utccp.example.com {
match .*apps\.utccp\.example\.com
answer "{{ .Name }} 60 IN A 10.12.24.125"
fallthrough
}
hosts {
10.12.24.125 api.utccp.example.com
10.12.24.125 api-int.utccp.example.com
10.12.24.125 bastion.utccp.example.com
10.12.24.127 master1.utccp.example.com
10.12.24.128 master2.utccp.example.com
10.12.24.129 master3.utccp.example.com
fallthrough
}
prometheus
cache 160
forward . 114.114.114.114
log
}
- Make AAAA queries return NXDOMAIN (not really recommended, because AAAA lookups then receive an error response)
.:53 {
template IN AAAA {
rcode NXDOMAIN
}
template IN A apps.utccp.example.com {
match .*apps\.utccp\.example\.com
answer "{{ .Name }} 60 IN A 10.12.24.125"
fallthrough
}
hosts {
10.12.24.125 api.utccp.example.com
10.12.24.125 api-int.utccp.example.com
10.12.24.125 bastion.utccp.example.com
10.12.24.127 master1.utccp.example.com
10.12.24.128 master2.utccp.example.com
10.12.24.129 master3.utccp.example.com
fallthrough
}
prometheus
cache 160
forward . 114.114.114.114
log
}
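Either Corefile can be checked by sending an A and an AAAA query to the base node and comparing the response code and the response time. The sketch below does this with the Python standard library only, building the DNS packet by hand. The function names are made up for this example, the server address in the usage comment is the base node address from the hosts block above, and the parser reads only the 12-byte header, which is enough to see the rcode and the answer count.

```python
import socket
import struct
import time
from typing import Tuple

QTYPE_A = 1
QTYPE_AAAA = 28


def build_query(name: str, qtype: int, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query: one question, class IN, recursion desired."""
    # Header fields: id, flags (RD bit set), QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def parse_header(response: bytes) -> Tuple[int, int]:
    """Return (rcode, answer_count) from the header of a DNS response."""
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    return flags & 0x000F, ancount


def timed_query(server: str, name: str, qtype: int,
                timeout: float = 5.0) -> Tuple[int, int, float]:
    """Send one UDP query and return (rcode, answer_count, elapsed_seconds)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        sock.sendto(build_query(name, qtype), (server, 53))
        response, _ = sock.recvfrom(4096)
        elapsed = time.monotonic() - start
    rcode, ancount = parse_header(response)
    return rcode, ancount, elapsed


# Usage against the base node (needs network access to it):
# for qtype in (QTYPE_A, QTYPE_AAAA):
#     print(qtype, timed_query("10.12.24.125", "api.utccp.example.com", qtype))
```

In the returned tuple, rcode 0 is NOERROR and rcode 3 is NXDOMAIN, so the second Corefile should show rcode 3 for the AAAA query. In both cases the AAAA reply should come back in well under the 5-second timeout.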
For lack of time, the duplicated base domain problem is left for the next post.
References:
- resolv.conf manual: https://man7.org/linux/man-pages/man5/resolv.conf.5.html
- DNS RFC (RFC 4472): https://www.rfc-editor.org/rfc/rfc4472.html
- OpenShift documentation: https://docs.openshift.com/container-platform/4.13/rest_api/operator_apis/dns-operator-openshift-io-v1.html#spec-upstreamresolvers
- Bug filed by a kind soul: https://bugzilla.redhat.com/show_bug.cgi?id=1027452
- glibc release announcement: https://lists.gnu.org/archive/html/info-gnu/2022-08/msg00000.html
- glibc source mirrored by a kind soul: https://github.com/bminor/glibc/tree/ibm/2.28/master
- glibc patch posted by a kind soul: https://sourceware.org/pipermail/libc-alpha/2022-June/139341.html
- Youdao Translate: https://fanyi.youdao.com