This article addresses the following problems:
- pods cannot reach a ClusterIP
- DNS lookups from busybox fail
- pod-to-pod traffic and access to the external network fail
Problem 1: pods cannot reach a ClusterIP
step0 - check the kube-proxy logs
The kube-proxy log shows Unknown proxy mode "", assuming iptables proxy, i.e. kube-proxy has fallen back to the iptables proxier:
$ kubectl logs -n kube-system kube-proxy-5n29r | more
W0720 03:22:47.942827 1 server_others.go:559] Unknown proxy mode "", assuming iptables proxy
I0720 03:22:48.245820 1 node.go:136] Successfully retrieved node IP: 10.160.18.183
I0720 03:22:48.245876 1 server_others.go:186] Using iptables Proxier.
I0720 03:22:48.246253 1 server.go:583] Version: v1.18.3
I0720 03:22:48.395170 1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_max' to 131072
I0720 03:22:48.395210 1 conntrack.go:52] Setting nf_conntrack_max to 131072
I0720 03:22:48.395578 1 conntrack.go:83] Setting conntrack hashsize to 32768
I0720 03:22:48.414004 1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400
I0720 03:22:48.414067 1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600
I0720 03:22:48.533638 1 config.go:315] Starting service config controller
I0720 03:22:48.533673 1 shared_informer.go:223] Waiting for caches to sync for service config
I0720 03:22:48.533997 1 config.go:133] Starting endpoints config controller
I0720 03:22:48.534016 1 shared_informer.go:223] Waiting for caches to sync for endpoints config
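Before changing anything, you can confirm which proxier a node is actually running: kube-proxy reports its active mode over its metrics endpoint. This assumes the kubeadm default metrics address 127.0.0.1:10249, so run it on the node itself:
$ curl http://127.0.0.1:10249/proxyMode
# prints the active mode, e.g. iptables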
step1 - install the required packages
# On every node
$ apt install ipset ipvsadm
step2 - load the kernel modules
$ modprobe -- ip_vs
$ modprobe -- ip_vs_rr
$ modprobe -- ip_vs_wrr
$ modprobe -- ip_vs_sh
$ modprobe -- nf_conntrack
# Note: many posts say to load nf_conntrack_ipv4, but on Ubuntu 20.04 that module has been renamed to nf_conntrack
# Verify that the modules are loaded (grep for ip_vs, with the underscore, or the ip_vs* modules will not match)
$ lsmod | grep -e ip_vs -e nf_conntrack
nf_conntrack_netlink 45056 0
nfnetlink 16384 3 nf_conntrack_netlink,ip_set
nf_conntrack 139264 5 xt_conntrack,nf_nat,nf_conntrack_netlink,xt_MASQUERADE,ip_vs
nf_defrag_ipv6 24576 2 nf_conntrack,ip_vs
nf_defrag_ipv4 16384 1 nf_conntrack
libcrc32c 16384 3 nf_conntrack,nf_nat,ip_vs
# Note also: on Ubuntu 20.04 these modules may already be loaded by default (the ip_vs* lines are not shown in the output above)
Note: if the ipvs modules are not loaded by default, you need to make sure they are loaded again after a system reboot; one option follows.
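On a systemd-based distribution such as Ubuntu 20.04, a modules-load.d drop-in does this without a custom script; systemd-modules-load reads it at boot. A minimal sketch (the file name is arbitrary):
$ cat <<EOF > /etc/modules-load.d/ipvs.conf
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
EOF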
step3 - edit the kube-proxy configuration
Set the mode field in the kube-proxy ConfigMap to "ipvs":
$ kubectl edit configmap kube-proxy -n kube-system
...
kind: KubeProxyConfiguration
metricsBindAddress: ""
mode: "ipvs"
nodePortAddresses: null
...
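If you prefer a non-interactive change, you can rewrite the ConfigMap in place. A sketch, assuming the field still has the kubeadm default value mode: "":
$ kubectl get configmap kube-proxy -n kube-system -o yaml | \
    sed 's/mode: ""/mode: "ipvs"/' | \
    kubectl apply -f -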
step4 - restart kube-proxy
You can delete the kube-proxy pods one at a time and let Kubernetes recreate them, or delete them all in one go:
$ kubectl get pod -n kube-system | grep kube-proxy |awk '{system("kubectl delete pod "$1" -n kube-system")}'
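An equivalent one-liner, assuming the pods carry the kubeadm default label k8s-app=kube-proxy:
$ kubectl delete pod -n kube-system -l k8s-app=kube-proxy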
Check the kube-proxy logs again:
$ kubectl logs -n kube-system kube-proxy-44zw5
I0720 05:37:30.026304 1 node.go:136] Successfully retrieved node IP: 10.160.18.181
I0720 05:37:30.026349 1 server_others.go:259] Using ipvs Proxier.
W0720 05:37:30.026600 1 proxier.go:429] IPVS scheduler not specified, use rr by default
I0720 05:37:30.026814 1 server.go:583] Version: v1.18.3
I0720 05:37:30.027200 1 conntrack.go:52] Setting nf_conntrack_max to 131072
I0720 05:37:30.027452 1 config.go:133] Starting endpoints config controller
I0720 05:37:30.027474 1 shared_informer.go:223] Waiting for caches to sync for endpoints config
I0720 05:37:30.027507 1 config.go:315] Starting service config controller
I0720 05:37:30.027529 1 shared_informer.go:223] Waiting for caches to sync for service config
I0720 05:37:30.127736 1 shared_informer.go:230] Caches are synced for endpoints config
I0720 05:37:30.127790 1 shared_informer.go:230] Caches are synced for service config
The Using ipvs Proxier. line shows that kube-proxy is now running in IPVS mode.
Now you can start a busybox container and ping the CoreDNS ClusterIP.
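You can also inspect the IPVS state directly on a node with the ipvsadm tool installed in step1; every ClusterIP:port should appear as a virtual server backed by pod IPs:
$ ipvsadm -Ln
# each ClusterIP:port is listed as a TCP/UDP virtual server (rr scheduler by default),
# with the backing pod IPs as its real servers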
Problem 2: DNS lookups from busybox fail
step0 - symptom
Even after pods could reach the DNS ClusterIP, busybox still failed to resolve a Service name:
$ kubectl run -i --tty --image busybox dns-test --restart=Never --rm -- /bin/sh
If you don't see a command prompt, try pressing enter.
/ # nslookup web-0.nginx
;; connection timed out; no servers could be reached
Searching online shows that busybox images newer than 1.28.4 ship a broken nslookup, so the lookup fails even though the DNS service itself is fine.
step1 - fix
Run the DNS query from a busybox:1.28.4 image instead:
$ kubectl run -i --tty --image busybox:1.28.4 dns-test --restart=Never --rm /bin/sh
If you don't see a command prompt, try pressing enter.
/ # nslookup web-0.nginx
Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
Name: web-0.nginx
Address 1: 172.1.2.175 web-0.nginx.default.svc.cluster.local
/ # ping web-0.nginx
PING web-0.nginx (172.1.2.175): 56 data bytes
64 bytes from 172.1.2.175: seq=0 ttl=62 time=1.050 ms
64 bytes from 172.1.2.175: seq=1 ttl=62 time=0.432 ms
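If you would rather not pin busybox, any image with the full DNS tools works as well; a sketch, where the image name (one the Kubernetes docs have used for DNS debugging) is an assumption:
$ kubectl run -i --tty --image gcr.io/kubernetes-e2e-test-images/dnsutils:1.3 dns-debug --restart=Never --rm -- /bin/sh
/ # nslookup web-0.nginx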
Problem 3: pod-to-pod traffic and access to the external network fail
Cause: iptables rules on the nodes
Fix:
Run the following on every node:
$ iptables -P INPUT ACCEPT
$ iptables -P FORWARD ACCEPT
$ iptables -F
$ iptables -L -n
$ iptables -t nat -I POSTROUTING -s 172.1.2.0/24 -j MASQUERADE
# Note: 172.1.2.0/24 is this node's slice of the pod-network-cidr; substitute each node's own pod CIDR
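To find the right subnet for each node, read the podCIDR the control plane allocated to it:
$ kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'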
Test:
$ kubectl run -i --tty --image busybox:1.28.4 connect-test --restart=Never --rm -- /bin/sh
If you don't see a command prompt, try pressing enter.
/ # ping 10.96.0.10 -c 1
PING 10.96.0.10 (10.96.0.10): 56 data bytes
64 bytes from 10.96.0.10: seq=0 ttl=64 time=0.082 ms
--- 10.96.0.10 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.082/0.082/0.082 ms
/ # ping 10.96.0.1 -c 1
PING 10.96.0.1 (10.96.0.1): 56 data bytes
64 bytes from 10.96.0.1: seq=0 ttl=64 time=0.072 ms
--- 10.96.0.1 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.072/0.072/0.072 ms
/ # ping 172.1.1.193 -c 1
PING 172.1.1.193 (172.1.1.193): 56 data bytes
64 bytes from 172.1.1.193: seq=0 ttl=64 time=0.108 ms
--- 172.1.1.193 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.108/0.108/0.108 ms
/ # ping 223.5.5.5 -c 1
PING 223.5.5.5 (223.5.5.5): 56 data bytes
64 bytes from 223.5.5.5: seq=0 ttl=114 time=5.659 ms
--- 223.5.5.5 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 5.659/5.659/5.659 ms
/ #
# Note: 10.96.0.1 is the ClusterIP of the kubernetes Service (kube-apiserver)
#       10.96.0.10 is the ClusterIP of CoreDNS
#       172.1.1.193 is the IP of a pod
Make the iptables rules survive a reboot
Run on every node:
$ iptables-save > /etc/iptables.up.rules
$ echo -e '#!/bin/bash\n/sbin/iptables-restore < /etc/iptables.up.rules' > /etc/network/if-pre-up.d/iptables
$ chmod +x /etc/network/if-pre-up.d/iptables
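Note that the if-pre-up.d hook only runs on systems managed by ifupdown; on an Ubuntu 20.04 host that uses netplan it may never fire. The iptables-persistent package is an alternative worth considering:
$ apt install iptables-persistent
# saves the current rules to /etc/iptables/rules.v4 and reloads them at boot
$ netfilter-persistent save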