in Kubernetes

Kubernetes部署集群碰到的问题解决

小结

Kubernetes部署集群碰到与Containerd,Metallb和Coredns相关的问题,进行了解决。

问题

问题一,Coredns一直在Pending的状态,如下:

[root@Master]# kubectl get pods -o wide -A
NAMESPACE      NAME                             READY   STATUS    RESTARTS   AGE     IP                NODE     NOMINATED NODE   READINESS GATES
kube-flannel   kube-flannel-ds-fr92w            1/1     Running   0          26s     192.168.238.130   master   <none>           <none>
kube-system    coredns-5d78c9869d-22jmb         0/1     Pending   0          7m46s   <none>            <none>   <none>           <none>
kube-system    coredns-5d78c9869d-nqwq7         0/1     Pending   0          7m46s   <none>            <none>   <none>           <none>

问题二,Kubernetes的节点一直处于NotReady状态,如下:

[root@Master]# kubectl get nodes
NAME     STATUS     ROLES           AGE     VERSION
master   NotReady   control-plane   2m29s   v1.27.3
node1    NotReady   <none>          68s     v1.27.3
node2    NotReady   <none>          41s     v1.27.3

问题三,部署metallb报找不到资源的问题,如下:

Error from server (NotFound): error when creating "metallb-deployments/address-pool.yml": the server could not find the requested resource (post ipaddresspools.metallb.io)
Error from server (NotFound): error when creating "metallb-deployments/address-pool.yml": the server could not find the requested resource (post l2advertisements.metallb.io)

解决

问题一解决,这里Coredns一直在Pending的状态,需要在Master节点上初始化Kubernetes集群之后,加入其它Node1和Node2之前,配置网络实例,如下:

kubeadm reset --force #重置
rm -rf /etc/cni/net.d/ #按提示要求删除
kubeadm init --pod-network-cidr=10.244.0.0/16 #初始化
kubectl apply -f kube-flannel.yml #配置网络实例
kubeadm join 192.168.238.130:6443 --token s5wfmf.lyohmfasblsxgmzn \
> --discovery-token-ca-cert-hash sha256:693bf9030a37358c8e7efa0380e7bdfc3af55d89139c97f273b2f14631868036 #分别 在Node1和Node2上执行,添加Node1和Node2到集群

注意以上顺序,问题可以解决:

[root@Master]# kubectl get pods -o wide -A
NAMESPACE        NAME                             READY   STATUS    RESTARTS   AGE     IP                NODE     NOMINATED NODE   READINESS GATES
kube-flannel     kube-flannel-ds-6b446            1/1     Running   0          9m29s   192.168.238.132   node2    <none>           <none>
kube-flannel     kube-flannel-ds-hh8qh            1/1     Running   0          10m     192.168.238.130   master   <none>           <none>
kube-flannel     kube-flannel-ds-znq5w            1/1     Running   0          9m56s   192.168.238.131   node1    <none>           <none>
kube-system      coredns-5d78c9869d-q6tt6         1/1     Running   0          11m     10.244.0.6        master   <none>           <none>
kube-system      coredns-5d78c9869d-wwgfm         1/1     Running   0          11m     10.244.0.7        master   <none>           <none>

问题二解决,相当简单直接,需要重启containerd的服务:

[root@Node2 ~]# systemctl restart containerd.service
[root@Node1 ~]# systemctl restart containerd.service

问题解决,如下:

[root@Master]# kubectl get nodes
NAME     STATUS   ROLES           AGE    VERSION
master   Ready    control-plane   5d5h   v1.27.3
node1    Ready    <none>          5d5h   v1.27.3
node2    Ready    </none><none>          5d5h   v1.27.3
[root@Master kubernetes]# 

问题三解决,需要注意部署顺序,如下:

[root@Master]# kubectl apply -f metallb-native.yml 
[root@Master]# kubectl apply -f address-pool.yml

先前顺序搞反了,解决结果如下:

[root@Master]# kubectl get pods -A | tail -6
metallb-system         controller-595f88d88f-94qbc                             1/1     Running   0              5d5h
metallb-system         speaker-5dx5d                                           1/1     Running   1 (53m ago)    5d5h
metallb-system         speaker-8pj7d                                           1/1     Running   1 (53m ago)    5d5h
metallb-system         speaker-sqdc5                                           1/1     Running   0              5d5h
nginx-ingress          nginx-ingress-2tplt                                     1/1     Running   0              4d13h
nginx-ingress          nginx-ingress-82khx                                     1/1     Running   0              4d13h

参考

Github: InternalError (failed calling webhook “ipaddresspoolvalidationwebhook.metallb.io”) #1540
Stackoverflow: Kubernetes coredns readiness probe failed

Write a Comment

Comment