Losing Connection to the K8s Server

Kubernetes version 1.13.2

Server Login Failed

This morning I found I had lost the connection to my icp4d Kubernetes server (it was fine last night). If I run:

# kubectl get pods
error: the server doesn't have a resource type "pods"
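Oddly, the error complains about a missing resource type rather than about authentication. One quick check (my own addition, not from the original session) is to raise the client verbosity and watch the HTTP status codes, since an unauthorized client cannot discover API resources and gets exactly this misleading message:

# -v=6 prints each HTTP round trip; an authentication problem shows up
# as a 401/403 on the discovery calls behind this misleading error.
kubectl get pods -v=6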

then:

# kubectl version
Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.4+icp-ee", GitCommit:"d03f6421b5463042d87aa0211f116ba4848a0d0f", GitTreeState:"clean", BuildDate:"2019-01-17T13:14:09Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}

error: You must be logged in to the server (the server has asked for the client to provide credentials)

But the kubectl config seems fine, and the token is there:

# kubectl config view
apiVersion: v1
clusters:
- cluster:
    insecure-skip-tls-verify: true
    server: https://172.16.3.23:8001
  name: icp-cluster1
- cluster:
    certificate-authority: mycluster/ca.pem
    server: https://172.16.3.23:8001
  name: mycluster
contexts:
- context:
    cluster: icp-cluster1
    user: admin
  name: ""
- context:
    cluster: mycluster
    user: mycluster
  name: mycluster
- context:
    cluster: mycluster
    namespace: zen
    user: mycluster-user
  name: mycluster-context
current-context: mycluster-context
kind: Config
preferences: {}
users:
- name: admin
  user:
    client-certificate: /etc/cfc/conf/kubecfg.crt
    client-key: /etc/cfc/conf/kubecfg.key
- name: mycluster
  user:
    client-certificate: /ibm/InstallPackage/ibm-cp-app/cluster/cfc-certs/kubernetes/kubecfg.crt
    client-key: /ibm/InstallPackage/ibm-cp-app/cluster/cfc-certs/kubernetes/kubecfg.key
- name: mycluster-user
  user:
    token: eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJhdF9oYXNoIjoiYjVkNDMxZDMwNGZmMGUyYWM0NWJlOTY1NjU5YTQyN2ViOWUwNzE5NCIsInJlYWxtTmFtZSI6ImN1c3RvbVJlYWxtIiwidW5pcXVlU2VjdXJpdHlOYW1lIjoiYWRtaW4iLCJpc3MiOiJodHRwczovL215Y2x1c3Rlci5pY3A6OTQ0My9vaWRjL2VuZHBvaW50L09QIiwiYXVkIjoiYTc1ZTZmZjQ3YzQyZTJhZDA3YjZiMjUzMTVmZTExMTQiLCJleHAiOjE1NjAyMzkwNzEsImlhdCI6MTU2MDIxMDI3MSwic3ViIjoiYWRtaW4iLCJ0ZWFtUm9sZU1hcHBpbmdzIjpbXX0.cwGioosvwjONIllJExWRADicgibShbSl2x05r3hpiMpXQQia_4HDuvfUCNNyvLiFkBfz1xvuoz9JeAkOdRa7QVR0RD8TGVnYyu10S50AQ5b_LjGaTNoxdGJjLLEGkBt5gzJCsZaVw49ttd-lzDV28badpUBtm1cih4-3o-wbM6inJqCqR97ujgImRW0BS0Jj1pbENAEidAquyZscGMje5vyyRc9A67VWWJxZXo0J1fG081yhvaryRWbvinLLSPRm8_eley1GqItUMvRmIpzC-X7xsg4zIvCE8QhPoKrJp2xRFjDwsvCN44wJv9hdkfx3cGxjjOBdg6ofsVkNND5njg
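The token is a JWT, so one thing worth checking is whether it has simply expired; the login error above is exactly what an expired token produces. A minimal sketch to decode the payload and read the exp claim (the jsonpath filter and the padding fix-up are my own, not from the original post):

# Extract the token for mycluster-user and take the JWT payload,
# which is the second dot-separated segment.
PAYLOAD=$(kubectl config view --raw \
  -o jsonpath='{.users[?(@.name=="mycluster-user")].user.token}' | cut -d. -f2)
# JWTs strip base64 padding; add it back so base64 can decode.
while [ $(( ${#PAYLOAD} % 4 )) -ne 0 ]; do PAYLOAD="${PAYLOAD}="; done
echo "$PAYLOAD" | base64 -d
date +%s   # if the "exp" claim is smaller than this, the token has expired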

I checked the docker and kubelet statuses; both were active:

systemctl status docker
systemctl status kubelet
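When both services are active but the cluster still misbehaves, their logs are the next place to look; a quick sketch (the time window is arbitrary):

# journalctl works for any systemd-managed unit; look for TLS, certificate,
# or authentication errors around the time the connection was lost.
journalctl -u kubelet --since "1 hour ago" --no-pager | tail -n 50
journalctl -u docker --since "1 hour ago" --no-pager | tail -n 50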

Then I rebooted all nodes, and the cluster came back up correctly. I don't know how to reproduce this issue, and sadly I have no idea what happened or how to fix it without rebooting.

Server Connection Refused 6443

A similar issue happened again in my dstest cluster:

# kubectl get pods -n test-1
The connection to the server 9.30.188.95:6443 was refused - did you specify the right host or port?

# kubectl version
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.2", GitCommit:"cff46ab41ff0bb44d8584413b598ad8360ec1def", GitTreeState:"clean", BuildDate:"2019-01-10T23:35:51Z", GoVersion:"go1.11.4", Compiler:"gc", Platform:"linux/amd64"}
The connection to the server 9.30.188.95:6443 was refused - did you specify the right host or port?

Check the list of required ports; 6443 is the API server's secure port.

# netstat -tunlp | grep 6443
tcp6 0 0 :::6443 :::* LISTEN 27047/kube-apiserve
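If the port check looks fine on the node, it is also worth probing it from the client machine (assumes curl is available there):

# A refused connection here means nothing is accepting on the port;
# a 401/403 response would mean the API server is up but rejecting us.
curl -k https://9.30.188.95:6443/healthz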

Note that there is no kube-apiserver service in systemctl, so how do we restart it? The kube-apiserver runs as a static pod, so I think I can restart the container directly with docker restart <container ID>:

# docker ps -a | grep apiserver
8f661411fa02 177db4b8e93a "kube-apiserver --au..." About an hour ago Up 4 minutes k8s_kube-apiserver_kube-apiserver-dstest1.fyre.ibm.com_kube-system_d175f38c007e23cc443d6ba50ba15533_0
0f05946a8a59 k8s.gcr.io/pause:3.1 "/pause" About an hour ago Up About an hour k8s_POD_kube-apiserver-dstest1.fyre.ibm.com_kube-system_d175f38c007e23cc443d6ba50ba15533_0
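Putting it together, a restart sketch (the k8s_kube-apiserver name filter matches the kubelet's container naming convention; the manifest move is an alternative that makes the kubelet itself recreate the static pod, assuming a kubeadm-style manifest path):

# Restart the kube-apiserver container directly (Docker runtime assumed).
docker restart $(docker ps -q --filter name=k8s_kube-apiserver)

# Alternative: move the static pod manifest away and back; the kubelet
# tears the pod down and recreates it from the manifest.
mv /etc/kubernetes/manifests/kube-apiserver.yaml /tmp/
sleep 20
mv /tmp/kube-apiserver.yaml /etc/kubernetes/manifests/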

I haven't had a chance to reproduce this issue, so this solution may not work…

In a healthy cluster:

# kbc cluster-info
Kubernetes master is running at https://9.30.188.95:6443
KubeDNS is running at https://9.30.188.95:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

Server Connection Refused 8080

This issue is similar to the 6443 one, but it shows:

The connection to the server localhost:8080 was refused - did you specify the right host or port?
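kubectl falls back to localhost:8080 (the legacy insecure port) when it cannot find any kubeconfig at all, so the first thing to check is whether a config is visible; a quick sketch:

# If both of these come up empty, kubectl has no server to talk to and
# silently defaults to localhost:8080.
echo "KUBECONFIG=$KUBECONFIG"
ls -l $HOME/.kube/config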

Recall that when we set up a K8s cluster with kubeadm, kubeadm init ends with:

...
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
...
Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

I can reproduce this issue when the environment variable KUBECONFIG is missing, so export it; either way works:

export KUBECONFIG=$HOME/.kube/config
export KUBECONFIG=/etc/kubernetes/admin.conf
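An export only lives as long as the current shell, so to make it stick across logins, append it to the shell profile (a sketch assuming bash):

# Persist the setting for future shells; adjust the file for your shell.
echo 'export KUBECONFIG=$HOME/.kube/config' >> $HOME/.bashrc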

A healthy /etc/kubernetes folder has these items:

# ls -ltr /etc/kubernetes/
total 36
drwxr-xr-x 3 root root 4096 Jun 20 10:02 pki
-rw------- 1 root root 5447 Jun 20 10:02 admin.conf
-rw------- 1 root root 5539 Jun 20 10:02 kubelet.conf
-rw------- 1 root root 5483 Jun 20 10:02 controller-manager.conf
-rw------- 1 root root 5435 Jun 20 10:02 scheduler.conf
drwxr-xr-x 2 root root 113 Jun 20 10:02 manifests

The manifests folder contains the YAML files for the etcd, kube-apiserver, kube-controller-manager, and kube-scheduler static pods.
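For reference, these are the file names a default kubeadm install creates there:

# A default kubeadm install creates these four static pod manifests:
#   etcd.yaml  kube-apiserver.yaml  kube-controller-manager.yaml  kube-scheduler.yaml
ls /etc/kubernetes/manifests/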
