2023. 4. 2. 23:56 · Uncategorized
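[Purpose]
This post is not meant to solve the 'ImagePullBackOff' problem itself.
It is a reference for Kubernetes pod debugging: when a pod is in an abnormal state rather than 'Running', it records the process of finding out why the pod is in that state.
For reference, the problem that prompted this post (the dashboard not coming up) was eventually resolved by redoing the setup from the static IP onward.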
[Problem]
- Opening the dashboard from the local environment on the Kubernetes master node returns a 503 error in the web browser
- Checking the dashboard pods shows that their status is 'ImagePullBackOff' rather than 'Running'
- So what causes 'ImagePullBackOff'?
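[Diagnostic steps]
1. Opening the dashboard from the local environment on the master node returns a 503 error in the web browser
2. List the pods to find the names of the dashboard-related pods
-> Confirmed that the problematic dashboard pods are in the 'ImagePullBackOff' state
# Check the status of all pods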
$ kubectl get pods --all-namespaces
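Check the status of specific pods (in my case the problem was the dashboard, so I only checked the dashboard pods)
# Check the status of the pods in a specific namespace
# kubectl -n {namespace} get pods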
$ kubectl -n kubernetes-dashboard get pods
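On a cluster with many pods, the abnormal ones can be hard to spot in the full listing. The following is a minimal sketch of a filter over the `kubectl get pods` table output. The function name `not_running` is hypothetical, and it assumes the default table format with a `STATUS` header column.

```shell
# not_running: keep the header row plus every pod whose STATUS is neither
# Running nor Completed. Reads `kubectl get pods` table output on stdin.
# The STATUS column index is looked up from the header row, so the same
# filter works with and without --all-namespaces.
not_running() {
  awk '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "STATUS") col = i; print; next }
    col && $col != "Running" && $col != "Completed"
  '
}

# Usage (needs a reachable cluster):
#   kubectl get pods --all-namespaces | not_running
```

Looking the column up by name rather than by position avoids hard-coding `$3` or `$4`, which differ depending on whether the NAMESPACE column is present.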
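3. Check the detailed status of the specific pods (to find the cause of 'ImagePullBackOff')
-> The cause turned out to be that the images could not be pulled
'dashboard-metrics-scraper-7bc864c59-c67sp' : Back-off pulling image "kubernetesui/metrics-scraper:v1.0.8"
'kubernetes-dashboard-6ff574dd47-cfn4z' : Back-off pulling image "kubernetesui/dashboard:v2.6.1"
+ In my case there were two dashboard-related pods, so I checked both. I did not look into why there are two; going by the image names, one runs the dashboard itself and the other its metrics scraper.
# Detailed status of 'dashboard-metrics-scraper-7bc864c59-c67sp'
# kubectl describe pod {POD NAME} -n {namespace}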
$ kubectl describe pod dashboard-metrics-scraper-7bc864c59-c67sp -n kubernetes-dashboard
# Detailed status of 'kubernetes-dashboard-6ff574dd47-cfn4z'
# kubectl describe pod {POD NAME} -n {namespace}
$ kubectl describe pod kubernetes-dashboard-6ff574dd47-cfn4z -n kubernetes-dashboard
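Most of the `describe` output is static pod metadata; the reason for a non-Running state usually shows up in the `Events:` section at the bottom. As a small sketch, the helper below trims the output to that section. `events_only` is a hypothetical name, and it assumes `Events:` starts a line, as it does in the standard output.

```shell
# events_only: print the describe output from the "Events:" line to the end,
# dropping the metadata, container, and volume sections above it.
events_only() {
  awk '/^Events:/ { show = 1 } show'
}

# Usage (needs a reachable cluster):
#   kubectl describe pod <pod-name> -n <namespace> | events_only
```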
(Details for both pods)
zin@master:~/Desktop$ kubectl -n kubernetes-dashboard get pods
NAME READY STATUS RESTARTS AGE
dashboard-metrics-scraper-7bc864c59-c67sp 0/1 ImagePullBackOff 0 11h
kubernetes-dashboard-6ff574dd47-cfn4z 0/1 ImagePullBackOff 0 11h
zin@master:~/Desktop$ kubectl describe pod dashboard-metrics-scraper-7bc864c59-c67sp -n kubernetes-dashboard
Name: dashboard-metrics-scraper-7bc864c59-c67sp
Namespace: kubernetes-dashboard
Priority: 0
Service Account: kubernetes-dashboard
Node: worker1/192.168.56.102
Start Time: Sat, 01 Apr 2023 00:04:54 +0900
Labels: k8s-app=dashboard-metrics-scraper
pod-template-hash=7bc864c59
Annotations: <none>
Status: Pending
IP: 10.244.1.2
IPs:
IP: 10.244.1.2
Controlled By: ReplicaSet/dashboard-metrics-scraper-7bc864c59
Containers:
dashboard-metrics-scraper:
Container ID:
Image: kubernetesui/metrics-scraper:v1.0.8
Image ID:
Port: 8000/TCP
Host Port: 0/TCP
State: Waiting
Reason: ImagePullBackOff
Ready: False
Restart Count: 0
Liveness: http-get http://:8000/ delay=30s timeout=30s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/tmp from tmp-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-zn5zk (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
tmp-volume:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
kube-api-access-zn5zk:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: kubernetes.io/os=linux
Tolerations: node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 11h default-scheduler 0/3 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }, 2 node(s) had untolerated taint {node.kubernetes.io/unreachable: }. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling..
Normal Scheduled 11h default-scheduler Successfully assigned kubernetes-dashboard/dashboard-metrics-scraper-7bc864c59-c67sp to worker1
Warning FailedCreatePodSandBox 11h kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "d730b6c289d9160a4c49766c40ae0edbfbb5999b522335196f1e90f6c7bafd51": plugin type="flannel" failed (add): loadFlannelSubnetEnv failed: open /run/flannel/subnet.env: no such file or directory
Warning Failed 11h kubelet Failed to pull image "kubernetesui/metrics-scraper:v1.0.8": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/kubernetesui/metrics-scraper:v1.0.8": failed to resolve reference "docker.io/kubernetesui/metrics-scraper:v1.0.8": failed to do request: Head "https://registry-1.docker.io/v2/kubernetesui/metrics-scraper/manifests/v1.0.8": dial tcp 52.1.184.176:443: connect: connection reset by peer
Warning Failed 11h kubelet Failed to pull image "kubernetesui/metrics-scraper:v1.0.8": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/kubernetesui/metrics-scraper:v1.0.8": failed to resolve reference "docker.io/kubernetesui/metrics-scraper:v1.0.8": failed to do request: Head "https://registry-1.docker.io/v2/kubernetesui/metrics-scraper/manifests/v1.0.8": write tcp 10.0.2.5:39302->34.194.164.123:443: write: connection reset by peer
Warning Failed 11h kubelet Failed to pull image "kubernetesui/metrics-scraper:v1.0.8": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/kubernetesui/metrics-scraper:v1.0.8": failed to resolve reference "docker.io/kubernetesui/metrics-scraper:v1.0.8": failed to do request: Head "https://registry-1.docker.io/v2/kubernetesui/metrics-scraper/manifests/v1.0.8": read tcp 10.0.2.5:36830->34.194.164.123:443: read: connection reset by peer
Normal Pulling 11h (x4 over 11h) kubelet Pulling image "kubernetesui/metrics-scraper:v1.0.8"
Warning Failed 11h (x4 over 11h) kubelet Error: ErrImagePull
Warning Failed 11h kubelet Failed to pull image "kubernetesui/metrics-scraper:v1.0.8": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/kubernetesui/metrics-scraper:v1.0.8": failed to resolve reference "docker.io/kubernetesui/metrics-scraper:v1.0.8": failed to do request: Head "https://registry-1.docker.io/v2/kubernetesui/metrics-scraper/manifests/v1.0.8": read tcp 10.0.2.5:46308->52.1.184.176:443: read: connection reset by peer
Warning Failed 11h (x6 over 11h) kubelet Error: ImagePullBackOff
Normal BackOff 3m35s (x215 over 11h) kubelet Back-off pulling image "kubernetesui/metrics-scraper:v1.0.8"
zin@master:~/Desktop$ kubectl describe pod kubernetes-dashboard-6ff574dd47-cfn4z -n kubernetes-dashboard
Name: kubernetes-dashboard-6ff574dd47-cfn4z
Namespace: kubernetes-dashboard
Priority: 0
Service Account: kubernetes-dashboard
Node: worker1/192.168.56.102
Start Time: Sat, 01 Apr 2023 00:04:55 +0900
Labels: k8s-app=kubernetes-dashboard
pod-template-hash=6ff574dd47
Annotations: <none>
Status: Pending
IP: 10.244.1.3
IPs:
IP: 10.244.1.3
Controlled By: ReplicaSet/kubernetes-dashboard-6ff574dd47
Containers:
kubernetes-dashboard:
Container ID:
Image: kubernetesui/dashboard:v2.6.1
Image ID:
Port: 8443/TCP
Host Port: 0/TCP
Args:
--auto-generate-certificates
--namespace=kubernetes-dashboard
State: Waiting
Reason: ImagePullBackOff
Ready: False
Restart Count: 0
Liveness: http-get https://:8443/ delay=30s timeout=30s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/certs from kubernetes-dashboard-certs (rw)
/tmp from tmp-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-bz4m4 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
kubernetes-dashboard-certs:
Type: Secret (a volume populated by a Secret)
SecretName: kubernetes-dashboard-certs
Optional: false
tmp-volume:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
kube-api-access-bz4m4:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: kubernetes.io/os=linux
Tolerations: node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 11h default-scheduler 0/3 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }, 2 node(s) had untolerated taint {node.kubernetes.io/unreachable: }. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling..
Normal Scheduled 11h default-scheduler Successfully assigned kubernetes-dashboard/kubernetes-dashboard-6ff574dd47-cfn4z to worker1
Warning FailedCreatePodSandBox 11h kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "3833f9d128546325bfcbb78e717bd735801ed8a61a8fcb86418c01509137290f": plugin type="flannel" failed (add): loadFlannelSubnetEnv failed: open /run/flannel/subnet.env: no such file or directory
Warning Failed 11h kubelet Failed to pull image "kubernetesui/dashboard:v2.6.1": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/kubernetesui/dashboard:v2.6.1": failed to resolve reference "docker.io/kubernetesui/dashboard:v2.6.1": failed to do request: Head "https://registry-1.docker.io/v2/kubernetesui/dashboard/manifests/v2.6.1": write tcp 10.0.2.5:60322->34.194.164.123:443: write: connection reset by peer
Warning Failed 11h kubelet Failed to pull image "kubernetesui/dashboard:v2.6.1": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/kubernetesui/dashboard:v2.6.1": failed to resolve reference "docker.io/kubernetesui/dashboard:v2.6.1": failed to do request: Head "https://registry-1.docker.io/v2/kubernetesui/dashboard/manifests/v2.6.1": write tcp 10.0.2.5:52072->34.194.164.123:443: write: connection reset by peer
Warning Failed 11h kubelet Failed to pull image "kubernetesui/dashboard:v2.6.1": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/kubernetesui/dashboard:v2.6.1": failed to resolve reference "docker.io/kubernetesui/dashboard:v2.6.1": failed to do request: Head "https://registry-1.docker.io/v2/kubernetesui/dashboard/manifests/v2.6.1": write tcp 10.0.2.5:36826->34.194.164.123:443: write: connection reset by peer
Normal Pulling 11h (x4 over 11h) kubelet Pulling image "kubernetesui/dashboard:v2.6.1"
Warning Failed 11h (x4 over 11h) kubelet Error: ErrImagePull
Warning Failed 11h kubelet Failed to pull image "kubernetesui/dashboard:v2.6.1": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/kubernetesui/dashboard:v2.6.1": failed to resolve reference "docker.io/kubernetesui/dashboard:v2.6.1": failed to do request: Head "https://registry-1.docker.io/v2/kubernetesui/dashboard/manifests/v2.6.1": write tcp 10.0.2.5:35896->34.194.164.123:443: write: connection reset by peer
Warning Failed 11h (x6 over 11h) kubelet Error: ImagePullBackOff
Normal BackOff 4m26s (x215 over 11h) kubelet Back-off pulling image "kubernetesui/dashboard:v2.6.1"
zin@master:~/Desktop$
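The two event logs above are long, but the image pull failures all end in the same low-level error. As a sketch of how to condense them, the hypothetical helper `pull_failure_causes` takes the last `: `-separated clause of every `Failed to pull image` event and counts how often each one occurs. It assumes the root cause is the final clause of the message, which holds for the events shown here.

```shell
# pull_failure_causes: for each "Failed to pull image" event on stdin, keep
# only the text after the last ": " (the innermost error) and count how many
# times each distinct cause appears, most frequent first.
pull_failure_causes() {
  awk -F': ' '/Failed to pull image/ { print $NF }' | sort | uniq -c | sort -rn
}

# Usage (needs a reachable cluster):
#   kubectl describe pod <pod-name> -n <namespace> | pull_failure_causes
```

Applied to the output above, every pull failure reduces to `connection reset by peer` while talking to registry-1.docker.io. That suggests a network-level problem between the node and the registry rather than a wrong image name or tag.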
4. Check whether the images exist
-> Confirmed that the images are not present
# Check whether the images exist locally
$ sudo docker images | grep kubernetesui
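The describe output shows both pods scheduled on worker1, and the `failed to pull and unpack image` wording in the events suggests the kubelet pulls through containerd. If so, the image list that matters may be the one held by the CRI runtime on that node rather than Docker's list on the master. The sketch below is hedged accordingly: `list_node_images` and `has_image` are hypothetical helpers, and which CLI is installed on the node is an assumption.

```shell
# list_node_images: print the local image list using whichever CLI exists.
# crictl talks to the CRI runtime the kubelet uses; docker is the fallback.
list_node_images() {
  if command -v crictl >/dev/null 2>&1; then
    sudo crictl images
  elif command -v docker >/dev/null 2>&1; then
    sudo docker images
  else
    echo "no crictl or docker CLI found" >&2
    return 1
  fi
}

# has_image: succeed if the image list on stdin mentions the given repository
# (matched as a fixed string, not a regular expression).
has_image() {
  grep -qF -- "$1"
}

# Usage (run on the node the pod was scheduled to, worker1 here):
#   list_node_images | has_image kubernetesui/dashboard && echo present || echo missing
```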
5. Try pulling the images directly (which I took to be the direct fix for the problem)
-> Running the pull fails with 'read: connection reset by peer' (I was not able to resolve this error directly)
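Since the direct pull fails with the same reset, one way to tell a registry problem from a node networking problem is to probe the registry endpoint from the node. This is a sketch and not part of the original steps. `probe_verdict` is a hypothetical helper; it assumes that Docker Hub answers an anonymous request to `/v2/` with HTTP 401 and that curl reports `000` when no HTTP response was received.

```shell
# probe_verdict: interpret the HTTP status code from a registry probe.
#   200 or 401 -> the registry answered, so the network path works
#   000        -> no HTTP response at all (reset, timeout, DNS failure, ...)
probe_verdict() {
  case "$1" in
    200|401) echo "reachable" ;;
    000)     echo "unreachable" ;;
    *)       echo "unexpected status $1" ;;
  esac
}

# Usage (run on the node that fails to pull):
#   code=$(curl -s -o /dev/null -m 10 -w '%{http_code}' https://registry-1.docker.io/v2/)
#   probe_verdict "$code"
```

An "unreachable" verdict from the node would be consistent with the eventual fix noted at the top of this post, where the network setup was redone starting from the static IP.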