At the end of the previous post, after some effort I had ruled out infrastructure problems and was concentrating on hunting for Kubernetes configuration errors. Now it was time to grab the last straw: the logs, hoping to find some clue in them that would lead to a solution.
I was quite anxious as I hit the refresh button in the browser: if the logs showed nothing unusual, I would have no other leads left to follow.
Starting with the master node: etcd.log, nothing unusual; flanneld.log, nothing unusual; kubelet.log, nothing unusual; ... every log was clean. The thing I feared was starting to happen.
Then the minion1 node: flanneld.log, nothing unusual; kubelet.log, nothing unusual; ... still every log was clean. I was starting to feel a touch of despair, and even began silently cursing those unreliable people at Google for not even writing their errors to the logs!
Finally the minion2 node: flanneld.log, nothing unusual; kubelet.log, nothing unusual; ... all the logs... wait! What is that in kube-proxy.log!
E1129 06:06:19.727461    2540 proxysocket.go:104] Dial failed: dial tcp 10.0.2.15:6443: connection refused
E1129 06:06:19.727502    2540 proxysocket.go:104] Dial failed: dial tcp 10.0.2.15:6443: connection refused
E1129 06:06:19.727537    2540 proxysocket.go:104] Dial failed: dial tcp 10.0.2.15:6443: connection refused
E1129 06:06:19.727570    2540 proxysocket.go:104] Dial failed: dial tcp 10.0.2.15:6443: connection refused
E1129 06:06:19.727578    2540 proxysocket.go:138] Failed to connect to balancer: failed to connect to an endpoint.
Finally, caught you!
My first reaction on seeing the error was: kube-proxy must be misconfigured. But I had no idea where exactly the mistake was, so I needed to take a closer look at the log message itself.
After staring at this log for three seconds, it suddenly hit me: that 10.0.2.15 should not be showing up here at all! A couple of extra words for readers unfamiliar with vagrant: the Vagrantfile that creates the virtual infrastructure is here. To let the VMs talk to each other, I configured a private_network, which is really just a VirtualBox host-only network. However, in order to manage the VMs, vagrant also creates a NAT interface for each VM by default, and the IP of that interface is 10.0.2.15. Every VM gets this same IP, and the VMs cannot reach each other over this NAT interface. So each VM ends up with two NICs: a NAT NIC and a private-network NIC, and the NAT NIC is eth0, the system's default interface. Kubernetes uses the private network for communication inside the cluster, and the apiserver, which exposes the cluster to the outside, also uses the private network. The fact that 10.0.2.15 shows up here means that Kubernetes' kube-proxy was bound to the eth0 interface instead of the expected eth1!
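To make the two-NIC layout concrete, here is a minimal check you can run inside any of the VMs (a sketch; eth0/eth1 and the 172.28.128.x addresses are the ones from my Vagrantfile):

# eth0 is the NAT interface vagrant creates for its own management;
# on every VM it shows the same address, 10.0.2.15
ip -4 addr show eth0
# eth1 is the host-only private_network interface; this is the address
# the VMs (and Kubernetes) should use to talk to each other,
# e.g. 172.28.128.3 on master, 172.28.128.5 on minion2
ip -4 addr show eth1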
Why doesn't the official documentation mention any of this? Because the official Ubuntu deployment docs target bare-metal machines, which usually have only one NIC, and even with two NICs different machines can normally still reach each other over the default one, so the problem never shows up there.
With that in mind, I immediately went to look up how to tell kube-proxy which interface to bind to. The most direct way is to run kube-proxy on the command line:
vagrant@master:/opt/bin$ ./kube-proxy --help
Usage of ./kube-proxy:
  --alsologtostderr[=false]: log to standard error as well as files
  --bind-address=0.0.0.0: The IP address for the proxy server to serve on (set to 0.0.0.0 for all interfaces)
  --cleanup-iptables[=false]: If true cleanup iptables rules and exit.
  --healthz-bind-address=127.0.0.1: The IP address for the health check server to serve on, defaulting to 127.0.0.1 (set to 0.0.0.0 for all interfaces)
  --healthz-port=10249: The port to bind the health check server. Use 0 to disable.
  --hostname-override="": If non-empty, will use this string as identification instead of the actual hostname.
  --iptables-sync-period=30s: How often iptables rules are refreshed (e.g. '5s', '1m', '2h22m'). Must be greater than 0.
  --kubeconfig="": Path to kubeconfig file with authorization information (the master location is set by the master flag).
  --log-backtrace-at=:0: when logging hits line file:N, emit a stack trace
  --log-dir="": If non-empty, write log files in this directory
  --log-flush-frequency=5s: Maximum number of seconds between log flushes
  --logtostderr[=true]: log to standard error instead of files
  --masquerade-all[=false]: If using the pure iptables proxy, SNAT everything
  --master="": The address of the Kubernetes API server (overrides any value in kubeconfig)
  --oom-score-adj=-999: The oom-score-adj value for kube-proxy process. Values must be within the range [-1000, 1000]
  --proxy-mode="": Which proxy mode to use: 'userspace' (older, stable) or 'iptables' (experimental). If blank, look at the Node object on the Kubernetes API and respect the 'net.experimental.kubernetes.io/proxy-mode' annotation if provided. Otherwise use the best-available proxy (currently userspace, but may change in future versions). If the iptables proxy is selected, regardless of how, but the system's kernel or iptables versions are insufficient, this always falls back to the userspace proxy.
  --proxy-port-range=: Range of host ports (beginPort-endPort, inclusive) that may be consumed in order to proxy service traffic. If unspecified (0-0) then ports will be randomly chosen.
  --resource-container="/kube-proxy": Absolute name of the resource-only container to create and run the Kube-proxy in (Default: /kube-proxy).
  --stderrthreshold=2: logs at or above this threshold go to stderr
  --udp-timeout=250ms: How long an idle UDP connection will be kept open (e.g. '250ms', '2s'). Must be greater than 0. Only applicable for proxy-mode=userspace
  --v=0: log level for V logs
  --version=false: Print version information and quit
  --vmodule=: comma-separated list of pattern=N settings for file-filtered logging
Scanning the list, the most suspicious option was --bind-address. From its description, though, it defaults to 0.0.0.0, which should accept requests on all interfaces. That was a bit contradictory, but at this point any long shot was worth trying, so I edited /etc/default/kube-proxy and added a parameter:
KUBE_PROXY_OPTS=" --master=http://172.28.128.3:8080 --logtostderr=true --bind-address=172.28.128.5"
Then restart kube-proxy.
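The edit-and-restart cycle looked roughly like this (a sketch; in this deployment the components run as upstart jobs, so the service name and log path mirror the flanneld example further down):

# Add --bind-address to the kube-proxy options (172.28.128.5 is minion2's
# private-network address in my setup)
vi /etc/default/kube-proxy
# Restart the upstart job and watch its log for fresh errors
service kube-proxy restart
tail -f /var/log/upstart/kube-proxy.log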
Fingers crossed.
Refresh the page. Shit! Still the same!
So that approach didn't work. I reverted to the original default configuration and took a step back to rethink.
I couldn't keep flailing around at random ("a rake swing here, a broom swipe there", as we say in Northeast China). I now had the key clue in kube-proxy.log, and the error had to be related to it. This was not the time to rush; I needed to calm down and first figure out what was actually going on.
Time to recall the Kubernetes architecture (yes, before playing with the Quick Start I did, of course, get a basic understanding of Kubernetes).
What role does kube-proxy actually play in this architecture? Time to exercise my Google skills again.
To make a long story short, after skimming a few low-nutrition intro articles, I found the answer I wanted in the official documentation. In short: kube-proxy runs on every node and is what forwards Service requests on to the actual Pods, and when filling in config-default.sh there are two IP ranges that need to be configured. That cleared up what kube-proxy does, but I was still somewhat in a fog: where exactly was the problem? At that point the crucial piece of evidence, that IP 10.0.2.15 which had no business being there, surfaced in my mind again. Could this IP appear somewhere else that I had missed before?
I decided to search all the logs for this IP, which is a job for grep:
root@minion2:/var/log/upstart# grep -ir "10.0.2.15" .
./kube-proxy.log:E1128 11:28:55.091647    2540 proxysocket.go:104] Dial failed: dial tcp 10.0.2.15:6443: connection refused
./kube-proxy.log:E1129 06:06:19.727461    2540 proxysocket.go:104] Dial failed: dial tcp 10.0.2.15:6443: connection refused
./kube-proxy.log:E1129 06:06:19.727502    2540 proxysocket.go:104] Dial failed: dial tcp 10.0.2.15:6443: connection refused
./kube-proxy.log:E1129 06:06:19.727537    2540 proxysocket.go:104] Dial failed: dial tcp 10.0.2.15:6443: connection refused
./kube-proxy.log:E1129 06:06:19.727570    2540 proxysocket.go:104] Dial failed: dial tcp 10.0.2.15:6443: connection refused
./network-interface-eth0.log:DHCPREQUEST of 10.0.2.15 on eth0 to 255.255.255.255 port 67 (xid=0x681829be)
./network-interface-eth0.log:DHCPOFFER of 10.0.2.15 from 10.0.2.2
./network-interface-eth0.log:DHCPACK of 10.0.2.15 from 10.0.2.2
./network-interface-eth0.log:bound to 10.0.2.15 -- renewal in 34886 seconds.
./network-interface-eth0.log:DHCPREQUEST of 10.0.2.15 on eth0 to 255.255.255.255 port 67 (xid=0x18078729)
./network-interface-eth0.log:DHCPOFFER of 10.0.2.15 from 10.0.2.2
./network-interface-eth0.log:DHCPACK of 10.0.2.15 from 10.0.2.2
./network-interface-eth0.log:bound to 10.0.2.15 -- renewal in 37295 seconds.
./flanneld.log:I1128 11:20:37.077746 02534 main.go:188] Using 10.0.2.15 as external interface
./flanneld.log:I1128 11:20:37.078771 02534 main.go:189] Using 10.0.2.15 as external endpoint
./flanneld.log:I1128 11:20:37.094503 02534 etcd.go:129] Found lease (172.16.10.0/24) for current IP (10.0.2.15), reusing
Sure enough, flanneld.log was hiding some fish that had slipped through the net! Putting this search result together with what I had just learned about how kube-proxy works, the whole thing suddenly became clear: flanneld had identified eth0 as the interface to listen on; communication with flanneld therefore went through the IP 10.0.2.15, but because that IP is vagrant's NAT interface IP, it failed; hence the Dial failed: dial tcp 10.0.2.15:6443: connection refused errors. Finally I had tracked you down, flanneld, you culprit!
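If you want to confirm this quickly on each node, the interface flanneld picked is right there in its log (a one-liner sketch):

# Shows which interface/IP flanneld chose as its external endpoint
grep "external" /var/log/upstart/flanneld.log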
Now that flanneld had been identified, all that was left was figuring out how to fix it. Again, the first step was to see what parameters flanneld accepts.
root@minion2:/opt/bin# ./flanneld --help
Usage: ./flanneld [OPTION]...
  -alsologtostderr=false: log to standard error as well as files
  -etcd-cafile="": SSL Certificate Authority file used to secure etcd communication
  -etcd-certfile="": SSL certification file used to secure etcd communication
  -etcd-endpoints="http://127.0.0.1:4001,http://127.0.0.1:2379": a comma-delimited list of etcd endpoints
  -etcd-keyfile="": SSL key file used to secure etcd communication
  -etcd-prefix="/coreos.com/network": etcd prefix
  -help=false: print this message
  -iface="": interface to use (IP or name) for inter-host communication
  -ip-masq=false: setup IP masquerade rule for traffic destined outside of overlay network
  -listen="": run as server and listen on specified address (e.g. ':8080')
  -log_backtrace_at=:0: when logging hits line file:N, emit a stack trace
  -log_dir="": If non-empty, write log files in this directory
  -logtostderr=false: log to standard error instead of files
  -networks="": run in multi-network mode and service the specified networks
  -public-ip="": IP accessible by other nodes for inter-host communication
  -remote="": run as client and connect to server on specified address (e.g. '10.1.2.3:8080')
  -remote-cafile="": SSL Certificate Authority file used to secure client/server communication
  -remote-certfile="": SSL certification file used to secure client/server communication
  -remote-keyfile="": SSL key file used to secure client/server communication
  -stderrthreshold=0: logs at or above this threshold go to stderr
  -subnet-dir="/run/flannel/networks": directory where files with env variables (subnet, MTU, ...) will be written to
  -subnet-file="/run/flannel/subnet.env": filename where env variables (subnet, MTU, ... ) will be written to
  -v=0: log level for V logs
  -version=false: print version and exit
  -vmodule=: comma-separated list of pattern=N settings for file-filtered logging
One look at the description of the -iface="" parameter and it was pretty much confirmed: this is the one! So I went ahead and changed it along those lines:
root@minion2:/opt/bin# vi /etc/default/flanneld
root@minion2:/opt/bin# cat /etc/default/flanneld
FLANNEL_OPTS="-iface=eth1 --etcd-endpoints=http://172.28.128.3:4001"
root@minion2:/opt/bin# service flanneld restart
flanneld stop/waiting
flanneld start/running, process 4495
root@minion2:/opt/bin# tail -f /var/log/upstart/flanneld.log -n 50
I1128 11:20:37.076880 02534 main.go:275] Installing signal handlers
I1128 11:20:37.077355 02534 main.go:130] Determining IP address of default interface
I1128 11:20:37.077746 02534 main.go:188] Using 10.0.2.15 as external interface
I1128 11:20:37.078771 02534 main.go:189] Using 10.0.2.15 as external endpoint
I1128 11:20:37.094503 02534 etcd.go:129] Found lease (172.16.10.0/24) for current IP (10.0.2.15), reusing
I1128 11:20:37.096168 02534 etcd.go:84] Subnet lease acquired: 172.16.10.0/24
I1128 11:20:37.097640 02534 udp.go:222] Watching for new subnet leases
I1129 11:59:35.035222 02534 main.go:292] Exiting...
I1129 11:59:35.075646 04495 main.go:275] Installing signal handlers
I1129 11:59:35.078235 04495 main.go:188] Using 172.28.128.5 as external interface
I1129 11:59:35.078257 04495 main.go:189] Using 172.28.128.5 as external endpoint
I1129 11:59:35.093587 04495 etcd.go:204] Picking subnet in range 172.16.1.0 ... 172.16.255.0
I1129 11:59:35.096558 04495 etcd.go:84] Subnet lease acquired: 172.16.29.0/24
I1129 11:59:35.106917 04495 udp.go:222] Watching for new subnet leases
I1129 11:59:35.108491 04495 udp.go:247] Subnet added: 172.16.10.0/24
Judging from the log output, the change had taken effect, so I made the same modification on the other two VMs. The exciting moment had arrived: I would finally get to see the beautiful kube-ui. Just thinking about it got me a little worked up!
Open the browser, refresh!
Nothing happened! Still that same maddening page!
This was genuinely a blow, but I was fairly confident the fix was pointing in the right direction. One possible explanation was that flanneld has to work together with other components: flanneld itself was now fixed, but the components that cooperate with it still needed changes. To verify this hypothesis I had to understand the exact steps Kubernetes goes through at startup, and there were two ways to do that: Google for an answer, or read the deployment scripts myself.
Given my experience so far, I figured the Google results probably wouldn't help much, so my first choice was to read the scripts myself. It took a bit more than half an hour to work out the structure of the startup scripts: kube-up.sh is only an entry point that, based on the value of the KUBERNETES_PROVIDER environment variable, picks and calls different provider-specific configuration scripts. The functions in kube-up.sh are just empty implementations, rather like virtual functions, and the concrete logic is provided by the script for each platform overriding them. The scripts I was using all live under the kubernetes/cluster/ubuntu/ path. You can tell this is Google's work: even their shell scripts are written as programming to an interface.
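Stripped down, the dispatch pattern looks roughly like this (a simplified sketch of the idea, not the verbatim upstream script):

# kube-up.sh (simplified): the entry point only dispatches
KUBERNETES_PROVIDER="${KUBERNETES_PROVIDER:-ubuntu}"

# Pull in the provider-specific overrides of the empty functions,
# e.g. cluster/ubuntu/util.sh when KUBERNETES_PROVIDER=ubuntu
source "cluster/${KUBERNETES_PROVIDER}/util.sh"

# These calls now run whatever the chosen provider implemented
verify-prereqs
kube-up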
Reading the scripts taught me a lot, but it didn't solve my problem, because the readability of shell scripts is, frankly, nothing to admire. So it was back to relying on Google.
Typing keywords like "kubernetes flanneld iface" into Google turned up nothing of value. If the direct search doesn't work, search sideways: I switched to the keyword "kubernetes multinode", hoping to find a detailed write-up on manually deploying a multi-node Kubernetes cluster.
Persistence paid off: I finally found one, a guide on deploying a multi-node Kubernetes cluster with docker. Although that document also does its configuration through automated scripts, it comes with an explanation for each script, for example the one describing the worker.sh script. The key part was one particular passage.
Boiled down, it says: flanneld writes the subnet it acquires into /run/flannel/subnet.env, and docker then has to be configured so that its --bip and --mtu options use the values from that file. This is the legendary "missing step"! Following that document, I modified the configuration:
root@minion2:/home/vagrant# cat /run/flannel/subnet.env
FLANNEL_NETWORK=172.16.0.0/16
FLANNEL_SUBNET=172.16.29.1/24
FLANNEL_MTU=1472
FLANNEL_IPMASQ=false
root@minion2:/home/vagrant# cat /etc/default/docker
DOCKER_OPTS=" -H tcp://127.0.0.1:4243 -H unix:///var/run/docker.sock --bip=172.16.10.1/24 --mtu=1472"
root@minion2:/home/vagrant# vi /etc/default/docker
root@minion2:/home/vagrant# cat /etc/default/docker
DOCKER_OPTS=" -H tcp://127.0.0.1:4243 -H unix:///var/run/docker.sock --bip=172.16.29.1/24 --mtu=1472"
Then make the same change on the other two nodes. And don't forget to restart the docker service.
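If you would rather script that step than edit the file by hand on every node, something along these lines does the job (a sketch; it assumes DOCKER_OPTS already contains --bip and --mtu entries, as in the output above):

# Load FLANNEL_SUBNET and FLANNEL_MTU as written out by flanneld
. /run/flannel/subnet.env
# Rewrite docker's bridge IP and MTU to match flannel's lease
sed -i "s|--bip=[^ \"]*|--bip=${FLANNEL_SUBNET}|" /etc/default/docker
sed -i "s|--mtu=[^ \"]*|--mtu=${FLANNEL_MTU}|" /etc/default/docker
service docker restart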
Open the browser, refresh! Bang!
Seeing the Dashboard finally appear, I was churning with excitement inside, but on the surface I had to stay calm. I am an expert, after all.
Looking back over the whole process, I took away the following two lessons: