K8S Incident Notes: KubeDeploymentReplicasMismatch / KubeDaemonSetRolloutStuck
1. Problem Description
Yesterday, two alert emails from Prometheus Alertmanager landed in my work inbox:
Alert 1: KubeDeploymentReplicasMismatch
Key details:
alertname = KubeDeploymentReplicasMismatch
message = Deployment kube-system/traefik has not matched the expected number of replicas for longer than 15 minutes.
Alert 2: KubeDaemonSetRolloutStuck
Key details:
alertname = KubeDaemonSetRolloutStuck
message = Only 83.33% of the desired Pods of DaemonSet kube-system/kube-router are scheduled and ready.
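Both alerts are typically produced from kube-state-metrics data (for example by the kube-prometheus / kubernetes-mixin rule set; this is an assumption about the alerting setup here). As a hedged sketch, assuming kube-state-metrics is reachable at the placeholder address $KSM_ENDPOINT, the metrics behind the two alerts can be inspected directly:
# Sketch only: $KSM_ENDPOINT is a placeholder, not a value from this cluster.
# DaemonSet side: 5 ready out of 6 desired is exactly the 83.33% in the alert.
curl -s http://$KSM_ENDPOINT/metrics \
  | grep -E 'kube_daemonset_status_(number_ready|desired_number_scheduled)' \
  | grep 'daemonset="kube-router"'
# Deployment side: the replicas-mismatch alert compares these two series.
curl -s http://$KSM_ENDPOINT/metrics \
  | grep -E 'kube_deployment_(spec_replicas|status_replicas_available)' \
  | grep 'deployment="traefik"'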
So what do these two alerts mean? Let's look at the same objects with kubectl:
[root@k8s-operation ~]# kubectl get daemonset
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
kube-router 6 6 5 6 5 <none> 563d
kube-router is a DaemonSet. Its desired count is 6 Pods (one per Node, and there are 6 Nodes), and 6 Pods are indeed running, but only 5 of them are reported as ready, which is what triggered the alert email above.
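The READY column comes straight from the DaemonSet's status sub-resource. A small sketch of pulling those fields directly (the field names are from the DaemonSetStatus API; the kube-system namespace is taken from the alert message):
# Print ready vs desired straight from the DaemonSet status (here expected to show 5/6).
kubectl -n kube-system get daemonset kube-router \
  -o jsonpath='{.status.numberReady}/{.status.desiredNumberScheduled}{"\n"}'
# Or dump the whole status block for comparison with the table above.
kubectl -n kube-system get daemonset kube-router -o jsonpath='{.status}{"\n"}'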
However, the Pods themselves tell a different story:
[root@k8s-operation ~]# kubectl get pods | grep kube-router
NAME READY STATUS RESTARTS AGE
kube-router-8rrwh 1/1 Running 0 563d
kube-router-czngl 1/1 Running 1 563d
kube-router-d9765 1/1 Running 0 563d
kube-router-d9gqj 1/1 Running 0 563d
kube-router-ntqkj 1/1 Running 5 563d
kube-router-qgw4s 1/1 Running 3 563d
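The 1/1 READY column is itself derived from each Pod's Ready condition. To see the raw condition (and when it last changed), a sketch along these lines works for any of the Pods above:
# Dump all status conditions for one of the Pods listed above,
# including the Ready condition and its last transition time.
kubectl get pod kube-router-qgw4s \
  -o jsonpath='{range .status.conditions[*]}{.type}={.status} {.lastTransitionTime}{"\n"}{end}'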
Next, check the kube-router process on the Node itself:
[root@k8s-noderouter01 ~]# ps -ef | grep kube-router
root 4113 4092 4 May28 ? 8-01:37:11 /usr/local/bin/kube-router --run-router=true --run-firewall=true --run-service-proxy=true --advertise-cluster-ip=true --advertise-loadbalancer-ip=true --advertise-pod-cidr=true --advertise-external-ip=true --cluster-asn=64512 --metrics-path=/metrics --metrics-port=20241 --enable-cni=true --enable-ibgp=true --enable-overlay=true --nodeport-bindon-all-ip=true --nodes-full-mesh=true --enable-pod-egress=true --cluster-cidr=10.233.0.0/16 --v=2 --kubeconfig=/var/lib/kube-router/kubeconfig
root 1011299 1010285 0 09:49 pts/0 00:00:00 grep --color=auto kube-router
[root@k8s-noderouter01 ~]# netstat -tulnp | grep kube-router
tcp 0 0 127.0.0.1:50051 0.0.0.0:* LISTEN 4113/kube-router
tcp 0 0 192.168.1.43:50051 0.0.0.0:* LISTEN 4113/kube-router
tcp 0 0 192.168.1.43:179 0.0.0.0:* LISTEN 4113/kube-router
tcp6 0 0 :::20241 :::* LISTEN 4113/kube-router
tcp6 0 0 :::20244 :::* LISTEN 4113/kube-router
As you can see, all 6 kube-router Pods are actually READY and the process is running normally, so why does the DaemonSet status report only 5 ready?
Initial triage showed that the objects in both alerts (the DaemonSet and the Deployment) had their affected Pods on the same Node (k8s-noderouter01), so I suspected that Node might be the problem. Let's check the Node status and the status of the cluster components:
[root@k8s-operation ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master01 Ready master 563d v1.15.6
k8s-master02 Ready master 563d v1.15.6
k8s-master03 Ready master 563d v1.15.6
k8s-node01 Ready node 2y334d v1.15.6
k8s-node02 Ready node 2y334d v1.15.6
k8s-noderouter01 Ready node 2y334d v1.15.6
[root@k8s-operation ~]# kubectl get cs
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
controller-manager Healthy ok
etcd-0 Healthy {"health":"true"}
etcd-1 Healthy {"health":"true"}
etcd-2 Healthy {"health":"true"}
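One more hedged aside before moving on to the kubelet: the STATUS column in kubectl get nodes only summarizes the Ready condition, so listing every condition on the suspect Node helps rule out memory/disk/PID pressure:
# List all conditions on the suspect Node, not just the Ready summary.
kubectl get node k8s-noderouter01 \
  -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}'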
Everything looked fine. Next, check the kubelet service on that Node:
[root@k8s-noderouter01 ~]# systemctl status kubelet
● kubelet.service - Kubernetes Kubelet
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Active: active (running) since Sat 2022-05-28 10:36:57 CST; 5 months 11 days ago
Main PID: 2993 (kubelet)
Tasks: 0
Memory: 136.0M
CGroup: /system.slice/kubelet.service
‣ 2993 /apps/kubernetes/bin/kubelet --bootstrap-kubeconfig=/apps/kubernetes/config/bootstrap.kubeconfig --network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/apps...
Nov 08 09:45:20 k8s-noderouter01 kubelet[2993]: I1108 09:45:20.711647 2993 setters.go:73] Using node IP: "192.168.1.43"
Nov 08 09:45:30 k8s-noderouter01 kubelet[2993]: I1108 09:45:30.769828 2993 setters.go:73] Using node IP: "192.168.1.43"
Nov 08 09:45:40 k8s-noderouter01 kubelet[2993]: I1108 09:45:40.834560 2993 setters.go:73] Using node IP: "192.168.1.43"
The kubelet was also running fine, and its logs showed no related errors.
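For reference, since the kubelet runs as a systemd unit here, its recent logs can be pulled and filtered with journalctl (a sketch; the time window and filter pattern are arbitrary choices, not taken from the original investigation):
# Pull the last hour of kubelet logs and look for anything mentioning kube-router or errors.
journalctl -u kubelet --since "1 hour ago" --no-pager | grep -iE 'kube-router|error' | tail -n 50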
At this point I had hit a wall and could not get any further on my own.
Later, after some digging, I found related issues in the kubernetes repository on GitHub: this turned out to be a known bug (from around 2019). For details, see:
- https://github.com/kubernetes/kubernetes/issues/82346
- https://github.com/kubernetes/kubernetes/issues/80968
- https://github.com/kubernetes/kubernetes/pull/83455
It appears to have been fixed in Kubernetes 1.18 and later releases.
2. Resolution
Although I never got to the bottom of the root cause myself, the fix turned out to be surprisingly simple: the good old restart. Either restart the affected Pod, or restart the kubelet service.
Method 1: Restart the affected Pod
Find the Pod that the DaemonSet/Deployment wrongly considers not ready and restart it (Kubernetes has no "restart Pod" command; just delete the Pod and the DaemonSet/Deployment controller will automatically start a new one):
[root@k8s-operation ~]# kubectl get pods -o wide | grep kube-router | grep k8s-noderouter01
kube-router-qgw4s 1/1 Running 3 563d 192.168.1.43 k8s-noderouter01
[root@k8s-operation ~]# kubectl delete pods kube-router-qgw4s
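The controller recreates the Pod on the same Node. One way to wait for it (a sketch, not part of the original fix) is simply to watch the Pods or the DaemonSet counters:
# Watch the replacement Pod get scheduled and become Ready...
kubectl get pods -o wide -w | grep kube-router
# ...or watch the DaemonSet counters converge back to 6/6/6.
kubectl get daemonset kube-router -w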
Once the new Pod is ready, check the kube-router DaemonSet again; its status is back to normal:
[root@k8s-operation ~]# kubectl get daemonset
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
kube-router 6 6 6 6 6 <none> 563d
Method 2: Restart the kubelet
[root@k8s-noderouter01 ~]# systemctl restart kubelet
After restarting the kubelet, remember to check its status:
[root@k8s-noderouter01 ~]# systemctl status kubelet
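As a hedged follow-up, it is also worth confirming afterwards that the Node stays Ready and that the DaemonSet counters converge again:
# The Node should report Ready, and the DaemonSet READY count should return to 6.
kubectl get node k8s-noderouter01
kubectl get daemonset kube-router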