====== K8s Installation for onap ======

Using VMs in Nemo testbed: onap-k8s-h0, onap-k8s-h1

We will install:
  * k8s
  * helm
  * k8s-dashboard

Equivalent ONAP task: cloud setup guide https://onap.readthedocs.io/en/latest/submodules/oom.git/docs/oom_cloud_setup_guide.html

===== - K8s installation =====

  * k8s installation: https://vitux.com/install-and-deploy-kubernetes-on-ubuntu/

==== - Init master node ====

<code bash>
sudo kubeadm init --pod-network-cidr=10.244.0.0/16
</code>

==== - Deploy pod network flannel ====

<code bash>
sudo kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
</code>

==== - Join worker nodes ====

=== - Install worker nodes ===

<code bash>
sudo apt install docker.io
sudo systemctl enable docker
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add
sudo apt-add-repository "deb http://apt.kubernetes.io/ kubernetes-xenial main"

#sudo apt install kubeadm
# better:
apt-cache policy kubeadm
# find the latest 1.16 version in the list
# it should look like 1.16.x-00, where x is the latest patch
apt-get install -y kubeadm=1.16.2-00

sudo swapoff -a
hostnamectl set-hostname slave-node
</code>

=== - Join workers ===

Then you can join any number of worker nodes by running the following on each as root:

<code bash>
kubeadm join 192.168.181.52:6443 --token uxk3nc.4n9ny3n6cpfdsmju \
    --discovery-token-ca-cert-hash sha256:fb99261a8180cf4e47c5cf561968b28199ef8b4a7d5099a10b97ee4d9e6a13cc
</code>

**Later, create a new token for joining worker nodes:**

<code bash>
kubeadm token create --print-join-command
</code>

==== - Remove workers ====

  * https://kubernetes.io/docs/tasks/administer-cluster/safely-drain-node/

<code bash>
kubectl get nodes
</code>

Next, tell Kubernetes to drain the node:

<code bash>
kubectl drain <node-name>
</code>

<code>
There are pending nodes to be drained: onap-k8s-h3
cannot delete Pods with local storage (use --delete-local-data to override): default/dashboard-demo-kubernetes-dashboard-54557bff5-hf4q8, onap/dep-dcae-dashboard-6b7c68d6d4-99rkm
cannot delete DaemonSet-managed Pods (use --ignore-daemonsets to ignore): kube-system/kube-flannel-ds-amd64-rvdc9, kube-system/kube-proxy-f779w
</code>

<code bash>
ubuntu@onap-k8s-h0:~/onap/dublin_deploy_overrides$ kubectl drain onap-k8s-h3 --ignore-daemonsets --delete-local-data
</code>

Once it returns (without giving an error), you can power down the node (or equivalently, if on a cloud platform, delete the virtual machine backing the node).

Delete the node:

<code bash>
kubectl delete nodes <node-name>
</code>

OR, if you leave the node in the cluster during the maintenance operation, you need to run kubectl uncordon <node-name> afterwards to tell Kubernetes that it can resume scheduling new pods onto the node.

==== - Upgrade cluster ====

  * https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/

**Check if nodes are ready:**

<code bash>
kubectl get nodes -o=wide
</code>

Need to update the flannel conf because of this bug: https://github.com/kubernetes/kubernetes/issues/82982

<code bash>
sudo cat /etc/cni/net.d/10-flannel.conflist
</code>
<code>
{
  "cniVersion": "0.2.0",    <--- ADD THIS
  "name": "cbr0",
  "plugins": [
</code>

Restart kubelet:

<code bash>
sudo systemctl restart kubelet.service
</code>

==== - Downgrade Cluster ====

<code bash>
# install kubeadm
apt-mark unhold kubeadm && \
apt-get update && apt-get install -y --allow-downgrades kubeadm=1.15.5-00 && \
apt-mark hold kubeadm

# install kubelet
apt-mark unhold kubelet kubectl && \
apt-get update && apt-get install -y --allow-downgrades kubelet=1.15.5-00 kubectl=1.15.5-00 && \
apt-mark hold kubelet kubectl

# restart kubelet
sudo systemctl restart kubelet
</code>

**Basically reinstalling!!** The kube-system pods are still new, so a reset is required:

<code bash>
sudo kubeadm reset
sudo kubeadm init --pod-network-cidr=10.244.0.0/16
# Follow the output: replace .kube/config etc
kubectl version
</code>
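The "replace .kube/config" comment above refers to the commands that kubeadm init prints at the end of its output. A minimal sketch, assuming the default admin.conf location and a non-root user:

<code bash>
# Re-create the kubectl config for the current user from the fresh admin.conf
# (these are the standard steps printed by kubeadm init; paths may differ)
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# Worker nodes then re-join with a freshly generated join command
kubeadm token create --print-join-command
</code>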
==== - Troubleshooting ====

=== - node not ready ===

  * Check the docker client and server version **as root**. Restart docker if the versions are not the same.

<code bash>
ubuntu@onap-k8s-h2:~$ sudo systemctl restart docker.service
ubuntu@onap-k8s-h2:~$ sudo docker version
Client:
 Version:           18.09.7
 API version:       1.39
 Go version:        go1.10.1
 Git commit:        2d0083d
 Built:             Fri Aug 16 14:20:06 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          18.09.7
  API version:      1.39 (minimum version 1.12)
  Go version:       go1.10.1
  Git commit:       2d0083d
  Built:            Wed Aug 14 19:41:23 2019
  OS/Arch:          linux/amd64
  Experimental:     false
</code>

=== - Pod not deployed ===

Check the worker nodes:

<code bash>
kubectl get nodes
kubectl describe pods -n kube-system tiller-deploy-6b9c575bfc-95jfl
# or
ssh worker-node
sudo journalctl -xfeu kubelet
</code>

If you see this error:

<code>
"cni0" already has an IP address different from 10.244.2.1/24.
Error while adding to cni network: failed to allocate for range 0: no IP addresses available in range set: 10.244.2.1-10.244.2.254
</code>

<code bash>
rm -rf /var/lib/cni/flannel/* && rm -rf /var/lib/cni/networks/cbr0/* && ip link delete cni0
rm -rf /var/lib/cni/networks/cni0/*
kubeadm reset
kubeadm join
</code>

===== - Helm =====

  * Installing helm: https://helm.sh/docs/using_helm/
  * Using helm: https://www.digitalocean.com/community/tutorials/how-to-install-software-on-kubernetes-clusters-with-the-helm-package-manager
  * https://banzaicloud.com/blog/creating-helm-charts/

==== - Installing helm ====

<code bash>
curl -L https://git.io/get_helm.sh | bash
helm init --history-max 200
helm repo update
</code>

==== - Installing Tiller ====

Create the tiller serviceaccount:

<code bash>
kubectl -n kube-system create serviceaccount tiller
</code>

Next, bind the tiller serviceaccount to the cluster-admin role:

<code bash>
kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=kube-system:tiller
</code>

Now we can run helm init, which installs Tiller on our cluster, along with some local housekeeping tasks such as downloading the stable repo details:

<code bash>
helm init --service-account tiller
</code>

To verify that Tiller is running, list the pods in the kube-system namespace:

<code bash>
kubectl get pods --namespace kube-system
</code>
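Another quick check, a sketch rather than a required step: helm version talks to Tiller through the API server, so once it reports both a Client and a Server version, Tiller is reachable.

<code bash>
# Reports a Client and a Server (Tiller) version once tiller-deploy is Running
helm version

# Or check the tiller pod directly
kubectl get pods --namespace kube-system | grep tiller-deploy
</code>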
===== - K8s Dashboard =====

==== - K8S dashboard Install with external IP ====

  * https://wiki.onap.org/display/DW/5.+Install+and+Use+Kubernetes+UI

<code bash>
ubuntu@onap-k8s-h0:~/certs$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v1.10.1/src/deploy/recommended/kubernetes-dashboard.yaml
Warning: kubectl apply should be used on resource created by either kubectl create --save-config or kubectl apply
secret/kubernetes-dashboard-certs configured
serviceaccount/kubernetes-dashboard created
role.rbac.authorization.k8s.io/kubernetes-dashboard-minimal created
rolebinding.rbac.authorization.k8s.io/kubernetes-dashboard-minimal created
deployment.apps/kubernetes-dashboard created
service/kubernetes-dashboard created
</code>

=== - Update NodePort ===

<code bash>
kubectl get svc dashboard-demo-kubernetes-dashboard -o yaml > dashboard-demo.yaml
</code>

<code yaml>
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2019-09-18T15:06:24Z"
  labels:
    app: kubernetes-dashboard
    chart: kubernetes-dashboard-1.9.0
    heritage: Tiller
    kubernetes.io/cluster-service: "true"
    release: dashboard-demo
  name: dashboard-demo-kubernetes-dashboard
  namespace: default
  resourceVersion: "14839"
  selfLink: /api/v1/namespaces/default/services/dashboard-demo-kubernetes-dashboard
  uid: cc6a2ec2-831e-469a-94c4-e60c7d3b6bd8
spec:
  #ClusterIP                    <-------- delete
  ports:
  - name: https
    port: 443
    protocol: TCP
    targetPort: 8443
  selector:
    app: kubernetes-dashboard
    release: dashboard-demo
  sessionAffinity: None
  type: NodePort                <----------- update
status:
  loadBalancer: {}
</code>

Apply the new svc:

<code bash>
kubectl replace -f dashboard-demo.yaml --force
</code>

=== - Alternatives ===

Copy the Service section from the downloaded dashboard yaml, then update it:

<code bash>
cat dashboard_svc_update.yml
</code>

<code yaml>
# ------------------- Dashboard Service ------------------- #
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kube-system
spec:
  type: NodePort                <------------- here
  ports:
    - port: 443
      targetPort: 8443
  selector:
    k8s-app: kubernetes-dashboard
</code>

<code bash>
kubectl replace -f dashboard_svc_update.yml --force
</code>

=== - URL ===

https://<node-ip>:<node-port>

https://192.168.181.52:31938/#!/login

=== - Login ===

Check the downloaded dashboard yaml for the created service account and secrets.

Show the serviceaccounts (**namespace needed!**):

<code bash>
kubectl get serviceaccounts -n kube-system
NAME                                 SECRETS   AGE
attachdetach-controller              1         3h24m
bootstrap-signer                     1         3h24m
certificate-controller               1         3h24m
clusterrole-aggregation-controller   1         3h24m
coredns                              1         3h24m
cronjob-controller                   1         3h24m
daemon-set-controller                1         3h24m
default                              1         3h24m
deployment-controller                1         3h24m
disruption-controller                1         3h24m
endpoint-controller                  1         3h24m
expand-controller                    1         3h24m
flannel                              1         3h16m
generic-garbage-collector            1         3h24m
horizontal-pod-autoscaler            1         3h24m
job-controller                       1         3h24m
kube-proxy                           1         3h24m
kubernetes-dashboard                 1         49m     <------------- here
namespace-controller                 1         3h24m
</code>

Show the token:

<code bash>
kubectl get secrets -n kube-system
NAME                               TYPE                                  DATA   AGE
kubernetes-dashboard-token-rtq8b   kubernetes.io/service-account-token   3      52m

ubuntu@onap-k8s-h0:~/k8s-dashboard$ kubectl describe secrets kubernetes-dashboard-token-rtq8b
Name:         default-token-8fx9x
Namespace:    default
Labels:       <none>
Annotations:  kubernetes.io/service-account.name: default
              kubernetes.io/service-account.uid: 170242ce-df23-47df-8229-b54451e59423

Type:  kubernetes.io/service-account-token

Data
====
ca.crt:     1025 bytes
namespace:  7 bytes
token:      <---------------- HERE ----------------
eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJkZWZhdWx0Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZWNyZXQubmFtZSI6ImRhc2hib2FyZC1kZW1vLWt1YmVybmV0ZXMtZGFzaGJvYXJkLXRva2VuLXoydGdqIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6ImRhc2hib2FyZC1kZW1vLWt1YmVybmV0ZXMtZGFzaGJvYXJkIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQudWlkIjoiMzQ1NDdkNWQtZWVkZS00ZjNlLWI0MWMtMjYyODQyMjg3NTM1Iiwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50OmRlZmF1bHQ6ZGFzaGJvYXJkLWRlbW8ta3ViZXJuZXRlcy1kYXNoYm9hcmQifQ.Hk4Tka6zYPca63jAEs777br5Zr65ijot7Z175nEcNOYc-Q_YgplbhNBZhH5L7cLdT0Ht3Lpy4f67RJ5CWAa4ekXrOA-4LyI0HTLM1pnao00_lcoAx7GbZy67e3tYi9hxiLWutfEuoB70j2VUM0cZsXHyCmNdqJ5xiKwGYOGA3hoo0X_5ma5c-FVOKT5bp6sVM3-yyVnXMiU9aeWU3ZXc0jQ9Sz858B4XWlk5Q5NvA_1SH3em-ZA_Aw_5s3LGkWcY-yD2gcruNMX2VGIqIYPz_aJ4NE25-6-MnryjxdKno2BUXoTWEd8Uq85FRZLy0NuF5PCTglhcRuAN4ykHpCP3xA
</code>

=== - Update role to show all info (NOT NEEDED with the yaml downloaded above) ===

<code bash>
kubectl apply -f https://raw.githubusercontent.com/shdowofdeath/dashboard/master/rolebindingdashboard.yaml
</code>
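To avoid copying the token by hand from the describe output in the Login section above, it can also be pulled directly from the secret. A sketch; the secret name suffix (rtq8b here) is generated per cluster:

<code bash>
# Print only the login token, base64-decoded from the secret data
kubectl -n kube-system get secret kubernetes-dashboard-token-rtq8b -o jsonpath='{.data.token}' | base64 --decode

# Or find the secret name first
kubectl -n kube-system get secrets | grep kubernetes-dashboard-token
</code>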
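As another alternative to editing the exported Service YAML in the Update NodePort section above, the service type can be switched with a single strategic-merge patch. A sketch; the service name and namespace below match the helm release used on this page, adjust them for the kube-system deployment:

<code bash>
# Switch the dashboard Service to NodePort in place
kubectl patch svc dashboard-demo-kubernetes-dashboard -p '{"spec":{"type":"NodePort"}}'

# Look up the node port that was allocated
kubectl get svc dashboard-demo-kubernetes-dashboard -o jsonpath='{.spec.ports[0].nodePort}'
</code>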
==== - K8S dashboard Install using helm (DO THIS) ====

<code bash>
helm install stable/kubernetes-dashboard --name dashboard-demo
</code>

If there is an error, try this:

<code bash>
kubectl create serviceaccount --namespace kube-system tiller
kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
kubectl patch deploy --namespace kube-system tiller-deploy -p '{"spec":{"template":{"spec":{"serviceAccount":"tiller"}}}}'
</code>

<code bash>
ubuntu@onap-k8s-h0:~$ helm install stable/kubernetes-dashboard --name dashboard-demo
NAME:   dashboard-demo
LAST DEPLOYED: Wed Sep 11 09:32:31 2019
NAMESPACE: default
STATUS: DEPLOYED

RESOURCES:
==> v1/Deployment
NAME                                 READY  UP-TO-DATE  AVAILABLE  AGE
dashboard-demo-kubernetes-dashboard  0/1    1           0          0s

==> v1/Pod(related)
NAME                                                 READY  STATUS             RESTARTS  AGE
dashboard-demo-kubernetes-dashboard-54557bff5-mqpfl  0/1    ContainerCreating  0         0s

==> v1/Secret
NAME                                 TYPE    DATA  AGE
dashboard-demo-kubernetes-dashboard  Opaque  0     0s

==> v1/Service
NAME                                 TYPE       CLUSTER-IP    EXTERNAL-IP  PORT(S)  AGE
dashboard-demo-kubernetes-dashboard  ClusterIP  10.105.56.94  <none>       443/TCP  0s

==> v1/ServiceAccount
NAME                                 SECRETS  AGE
dashboard-demo-kubernetes-dashboard  1        0s

==> v1beta1/Role
NAME                                 AGE
dashboard-demo-kubernetes-dashboard  0s

==> v1beta1/RoleBinding
NAME                                 AGE
dashboard-demo-kubernetes-dashboard  0s

NOTES:
*********************************************************************************
*** PLEASE BE PATIENT: kubernetes-dashboard may take a few minutes to install ***
*********************************************************************************
Get the Kubernetes Dashboard URL by running:
  export POD_NAME=$(kubectl get pods -n default -l "app=kubernetes-dashboard,release=dashboard-demo" -o jsonpath="{.items[0].metadata.name}")
  echo https://127.0.0.1:8443/
  kubectl -n default port-forward $POD_NAME 8443:8443
</code>

**After installing with helm, go to Update NodePort above to allow NodePort access.**

=== - Error installing container with helm ===

Reproduction: try installing

<code bash>
helm install stable/kubernetes-dashboard --name dashboard-demo
</code>

Hints:
  * https://github.com/helm/helm/issues/3055

Solution:

<code bash>
kubectl create serviceaccount --namespace kube-system tiller
kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
kubectl patch deploy --namespace kube-system tiller-deploy -p '{"spec":{"template":{"spec":{"serviceAccount":"tiller"}}}}'
</code>

=== - Error: Cannot delete container ===

Pod/container cannot be removed.

<code bash>
kubectl describe pod -n kube-system tiller-deploy-86dfd7858f-lvhbl
</code>
<code>
Events:
  Type     Reason          Age                     From                  Message
  ----     ------          ----                    ----                  -------
  Normal   Scheduled       9m35s                   default-scheduler     Successfully assigned kube-system/tiller-deploy-86dfd7858f-lvhbl to onap-k8s-h2
  Normal   Pulling         9m34s                   kubelet, onap-k8s-h2  Pulling image "gcr.io/kubernetes-helm/tiller:canary"
  Warning  Failed          9m19s                   kubelet, onap-k8s-h2  Failed to pull image "gcr.io/kubernetes-helm/tiller:canary": rpc error: code = Unknown desc = Error response from daemon: Get https://gcr.io/v2/kubernetes-helm/tiller/manifests/canary: Get https://gcr.io/v2/token?scope=repository%3Akubernetes-helm%2Ftiller%3Apull&service=gcr.io: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
  Warning  Failed          9m19s                   kubelet, onap-k8s-h2  Error: ErrImagePull
  Normal   SandboxChanged  6m48s (x11 over 9m18s)  kubelet, onap-k8s-h2  Pod sandbox changed, it will be killed and re-created.
</code>

Check the docker log:

<code bash>
sudo journalctl -xfeu docker
</code>
<code>
Sep 18 14:33:40 onap-k8s-h2 dockerd[10727]: time="2019-09-18T14:33:40.155714272Z" level=error msg="Handler for POST /containers/3289b49239ee457ef86bb48cd8794fa71ca7a7cb46990701c7f1edaa5df6f6b5/stop returned error: cannot stop container: 3289b49239ee457ef86bb48cd8794fa71ca7a7cb46990701c7f1edaa5df6f6b5: Cannot kill container 3289b49239ee457ef86bb48cd8794fa71ca7a7cb46990701c7f1edaa5df6f6b5: unknown error after kill: runc did not terminate sucessfully: container_linux.go:388: signaling init process caused \"permission denied\"\n: unknown"
</code>

The problem is that the snap version of docker was installed while installing Ubuntu, so all of its AppArmor profiles were configured. The solution is to reinstall AppArmor, or do this:

<code bash>
sudo aa-status
sudo aa-remove-unknown
</code>
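A quick way to confirm the snap-docker theory before touching AppArmor, a sketch assuming snapd is present on the host:

<code bash>
# If docker shows up here, it was installed as a snap and ships its own confinement profiles
snap list | grep -i docker

# List loaded AppArmor profiles mentioning docker
sudo aa-status | grep -i docker
</code>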
=== - Dashboard pod CrashLoopBackOff ===

Problem: the dashboard tries to write a secret to kube-system but does not have permission.

<code bash>
kubectl describe pods dashboard-demo-kubernetes-dashboard-54557bff5-qfnp2
kubectl logs dashboard-demo-kubernetes-dashboard-54557bff5-qfnp2 kubernetes-dashboard-container-name
</code>
<code>
...
2019/09/18 15:47:21 Storing encryption key in a secret
panic: secrets is forbidden: User "system:serviceaccount:default:dashboard-demo-kubernetes-dashboard" cannot create resource "secrets" in API group "" in the namespace "kube-system"
</code>

<code bash>
kubectl describe serviceaccounts dashboard-demo-kubernetes-dashboard
</code>
<code>
Name:                dashboard-demo-kubernetes-dashboard      <--------- Here
Namespace:           default                                  <-------------- here
Labels:              app=kubernetes-dashboard
                     chart=kubernetes-dashboard-1.9.0
                     heritage=Tiller
                     release=dashboard-demo
Annotations:         <none>
Image pull secrets:  <none>
Mountable secrets:   dashboard-demo-kubernetes-dashboard-token-z2tgj
Tokens:              dashboard-demo-kubernetes-dashboard-token-z2tgj
Events:              <none>
</code>

Solution:

<code bash>
kubectl delete clusterrolebinding kubernetes-dashboard    # if it exists
kubectl create clusterrolebinding kubernetes-dashboard --clusterrole=cluster-admin --serviceaccount=default:dashboard-demo-kubernetes-dashboard
</code>

The equivalent yaml:

<code yaml>
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: kubernetes-dashboard
  labels:
    k8s-app: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: dashboard-demo-kubernetes-dashboard      <---- here
  namespace: default                             <---- here
</code>
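A minimal sketch of applying that binding from a file and forcing the failing pod to restart so it picks up the new permission; the file name dashboard-crb.yaml is just an example:

<code bash>
# Apply the ClusterRoleBinding saved from the yaml above (example file name)
kubectl apply -f dashboard-crb.yaml

# Delete the crashing pod; its Deployment recreates it with the fixed RBAC in effect
kubectl delete pod -n default -l app=kubernetes-dashboard,release=dashboard-demo
</code>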