[Kubernetes] Kubernetes Debugging _ Pod Debugging _ Finding Out Why a Pod Is Not Running _ ImagePullBackOff

2023. 4. 2. 23:56


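[Purpose]

This post is not a direct fix for the 'ImagePullBackOff' error itself.

It is a reference for Kubernetes pod debugging: when a pod is in an abnormal state other than 'Running', how to track down why it is in that state.

For reference, the dashboard failing to load, which is what prompted this post, was eventually resolved by redoing the setup starting from the static IP configuration.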

 


[Problem]

  • Opening the dashboard locally on the Kubernetes master node returned a 503 error in the web browser
  • Checking the dashboard pods showed that their status was 'ImagePullBackOff', not 'Running'
  • So what causes 'ImagePullBackOff'?

 


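[Diagnostic steps]

1.  Opening the dashboard locally on the master node returns a 503 error in the web browser

2.  List the pods to find the names of the dashboard-related pods

     -> Confirmed that the problematic dashboard pods are in the 'ImagePullBackOff' state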

# Check the status of all pods
$ kubectl get pods --all-namespaces

 

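Check the pods in a specific namespace (in my case the problem was the dashboard, so I only checked the dashboard pods)

# Check the pods in a specific namespace
# kubectl -n {namespace} get pods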
$ kubectl -n kubernetes-dashboard get pods
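
On a cluster with many pods, the unhealthy ones are easy to miss in the full listing. The following is a minimal sketch of a filter that keeps only rows whose STATUS is neither Running nor Completed. The function name `not_running` is hypothetical; it only post-processes the table that `kubectl get pods` prints and finds the STATUS column from the header, so it should work with or without `--all-namespaces`.

```shell
# Keep the header plus every row whose STATUS is not Running/Completed.
# Reads the table printed by `kubectl get pods` on stdin.
not_running() {
  awk '
    NR == 1 {
      # Locate the STATUS column from the header row.
      for (i = 1; i <= NF; i++) if ($i == "STATUS") col = i
      print
      next
    }
    $col != "Running" && $col != "Completed"
  '
}

# Hypothetical usage:
#   kubectl get pods --all-namespaces | not_running
```

kubectl can also filter on the server side with `kubectl get pods --all-namespaces --field-selector=status.phase!=Running`. That selector matches on the pod phase (Pending, Succeeded, Failed and so on), not on the STATUS text shown in the table.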

 

 

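3. Check the detailed status of each pod (to find the cause of 'ImagePullBackOff')

    -> The cause is that the kubelet could not pull the images below; the events show each request to registry-1.docker.io failing with 'connection reset by peer'

        'dashboard-metrics-scraper-7bc864c59-c67sp' : Back-off pulling image "kubernetesui/metrics-scraper:v1.0.8"

        'kubernetes-dashboard-6ff574dd47-cfn4z'         : Back-off pulling image "kubernetesui/dashboard:v2.6.1"

        + In my case there were two dashboard-related pods (the Dashboard manifest deploys the UI and the metrics scraper as separate Deployments), so I checked the status of both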

# Detailed status of 'dashboard-metrics-scraper-7bc864c59-c67sp'
# kubectl describe pod {pod name} -n {namespace}
$ kubectl describe pod dashboard-metrics-scraper-7bc864c59-c67sp -n kubernetes-dashboard

# Detailed status of 'kubernetes-dashboard-6ff574dd47-cfn4z'
# kubectl describe pod {pod name} -n {namespace}
$ kubectl describe pod kubernetes-dashboard-6ff574dd47-cfn4z -n kubernetes-dashboard

(Details of the two pods)

zin@master:~/Desktop$ kubectl -n  kubernetes-dashboard get pods
NAME                                        READY   STATUS             RESTARTS   AGE
dashboard-metrics-scraper-7bc864c59-c67sp   0/1     ImagePullBackOff   0          11h
kubernetes-dashboard-6ff574dd47-cfn4z       0/1     ImagePullBackOff   0          11h
zin@master:~/Desktop$ kubectl describe pod dashboard-metrics-scraper-7bc864c59-c67sp -n kubernetes-dashboard
Name:             dashboard-metrics-scraper-7bc864c59-c67sp
Namespace:        kubernetes-dashboard
Priority:         0
Service Account:  kubernetes-dashboard
Node:             worker1/192.168.56.102
Start Time:       Sat, 01 Apr 2023 00:04:54 +0900
Labels:           k8s-app=dashboard-metrics-scraper
                  pod-template-hash=7bc864c59
Annotations:      <none>
Status:           Pending
IP:               10.244.1.2
IPs:
  IP:           10.244.1.2
Controlled By:  ReplicaSet/dashboard-metrics-scraper-7bc864c59
Containers:
  dashboard-metrics-scraper:
    Container ID:   
    Image:          kubernetesui/metrics-scraper:v1.0.8
    Image ID:       
    Port:           8000/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ImagePullBackOff
    Ready:          False
    Restart Count:  0
    Liveness:       http-get http://:8000/ delay=30s timeout=30s period=10s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /tmp from tmp-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-zn5zk (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  tmp-volume:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  kube-api-access-zn5zk:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              kubernetes.io/os=linux
Tolerations:                 node-role.kubernetes.io/master:NoSchedule
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                  Age                    From               Message
  ----     ------                  ----                   ----               -------
  Warning  FailedScheduling        11h                    default-scheduler  0/3 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }, 2 node(s) had untolerated taint {node.kubernetes.io/unreachable: }. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling..
  Normal   Scheduled               11h                    default-scheduler  Successfully assigned kubernetes-dashboard/dashboard-metrics-scraper-7bc864c59-c67sp to worker1
  Warning  FailedCreatePodSandBox  11h                    kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "d730b6c289d9160a4c49766c40ae0edbfbb5999b522335196f1e90f6c7bafd51": plugin type="flannel" failed (add): loadFlannelSubnetEnv failed: open /run/flannel/subnet.env: no such file or directory
  Warning  Failed                  11h                    kubelet            Failed to pull image "kubernetesui/metrics-scraper:v1.0.8": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/kubernetesui/metrics-scraper:v1.0.8": failed to resolve reference "docker.io/kubernetesui/metrics-scraper:v1.0.8": failed to do request: Head "https://registry-1.docker.io/v2/kubernetesui/metrics-scraper/manifests/v1.0.8": dial tcp 52.1.184.176:443: connect: connection reset by peer
  Warning  Failed                  11h                    kubelet            Failed to pull image "kubernetesui/metrics-scraper:v1.0.8": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/kubernetesui/metrics-scraper:v1.0.8": failed to resolve reference "docker.io/kubernetesui/metrics-scraper:v1.0.8": failed to do request: Head "https://registry-1.docker.io/v2/kubernetesui/metrics-scraper/manifests/v1.0.8": write tcp 10.0.2.5:39302->34.194.164.123:443: write: connection reset by peer
  Warning  Failed                  11h                    kubelet            Failed to pull image "kubernetesui/metrics-scraper:v1.0.8": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/kubernetesui/metrics-scraper:v1.0.8": failed to resolve reference "docker.io/kubernetesui/metrics-scraper:v1.0.8": failed to do request: Head "https://registry-1.docker.io/v2/kubernetesui/metrics-scraper/manifests/v1.0.8": read tcp 10.0.2.5:36830->34.194.164.123:443: read: connection reset by peer
  Normal   Pulling                 11h (x4 over 11h)      kubelet            Pulling image "kubernetesui/metrics-scraper:v1.0.8"
  Warning  Failed                  11h (x4 over 11h)      kubelet            Error: ErrImagePull
  Warning  Failed                  11h                    kubelet            Failed to pull image "kubernetesui/metrics-scraper:v1.0.8": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/kubernetesui/metrics-scraper:v1.0.8": failed to resolve reference "docker.io/kubernetesui/metrics-scraper:v1.0.8": failed to do request: Head "https://registry-1.docker.io/v2/kubernetesui/metrics-scraper/manifests/v1.0.8": read tcp 10.0.2.5:46308->52.1.184.176:443: read: connection reset by peer
  Warning  Failed                  11h (x6 over 11h)      kubelet            Error: ImagePullBackOff
  Normal   BackOff                 3m35s (x215 over 11h)  kubelet            Back-off pulling image "kubernetesui/metrics-scraper:v1.0.8"
zin@master:~/Desktop$ kubectl describe pod kubernetes-dashboard-6ff574dd47-cfn4z -n kubernetes-dashboard
Name:             kubernetes-dashboard-6ff574dd47-cfn4z
Namespace:        kubernetes-dashboard
Priority:         0
Service Account:  kubernetes-dashboard
Node:             worker1/192.168.56.102
Start Time:       Sat, 01 Apr 2023 00:04:55 +0900
Labels:           k8s-app=kubernetes-dashboard
                  pod-template-hash=6ff574dd47
Annotations:      <none>
Status:           Pending
IP:               10.244.1.3
IPs:
  IP:           10.244.1.3
Controlled By:  ReplicaSet/kubernetes-dashboard-6ff574dd47
Containers:
  kubernetes-dashboard:
    Container ID:  
    Image:         kubernetesui/dashboard:v2.6.1
    Image ID:      
    Port:          8443/TCP
    Host Port:     0/TCP
    Args:
      --auto-generate-certificates
      --namespace=kubernetes-dashboard
    State:          Waiting
      Reason:       ImagePullBackOff
    Ready:          False
    Restart Count:  0
    Liveness:       http-get https://:8443/ delay=30s timeout=30s period=10s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /certs from kubernetes-dashboard-certs (rw)
      /tmp from tmp-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-bz4m4 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  kubernetes-dashboard-certs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  kubernetes-dashboard-certs
    Optional:    false
  tmp-volume:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  kube-api-access-bz4m4:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              kubernetes.io/os=linux
Tolerations:                 node-role.kubernetes.io/master:NoSchedule
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                  Age                    From               Message
  ----     ------                  ----                   ----               -------
  Warning  FailedScheduling        11h                    default-scheduler  0/3 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }, 2 node(s) had untolerated taint {node.kubernetes.io/unreachable: }. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling..
  Normal   Scheduled               11h                    default-scheduler  Successfully assigned kubernetes-dashboard/kubernetes-dashboard-6ff574dd47-cfn4z to worker1
  Warning  FailedCreatePodSandBox  11h                    kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "3833f9d128546325bfcbb78e717bd735801ed8a61a8fcb86418c01509137290f": plugin type="flannel" failed (add): loadFlannelSubnetEnv failed: open /run/flannel/subnet.env: no such file or directory
  Warning  Failed                  11h                    kubelet            Failed to pull image "kubernetesui/dashboard:v2.6.1": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/kubernetesui/dashboard:v2.6.1": failed to resolve reference "docker.io/kubernetesui/dashboard:v2.6.1": failed to do request: Head "https://registry-1.docker.io/v2/kubernetesui/dashboard/manifests/v2.6.1": write tcp 10.0.2.5:60322->34.194.164.123:443: write: connection reset by peer
  Warning  Failed                  11h                    kubelet            Failed to pull image "kubernetesui/dashboard:v2.6.1": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/kubernetesui/dashboard:v2.6.1": failed to resolve reference "docker.io/kubernetesui/dashboard:v2.6.1": failed to do request: Head "https://registry-1.docker.io/v2/kubernetesui/dashboard/manifests/v2.6.1": write tcp 10.0.2.5:52072->34.194.164.123:443: write: connection reset by peer
  Warning  Failed                  11h                    kubelet            Failed to pull image "kubernetesui/dashboard:v2.6.1": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/kubernetesui/dashboard:v2.6.1": failed to resolve reference "docker.io/kubernetesui/dashboard:v2.6.1": failed to do request: Head "https://registry-1.docker.io/v2/kubernetesui/dashboard/manifests/v2.6.1": write tcp 10.0.2.5:36826->34.194.164.123:443: write: connection reset by peer
  Normal   Pulling                 11h (x4 over 11h)      kubelet            Pulling image "kubernetesui/dashboard:v2.6.1"
  Warning  Failed                  11h (x4 over 11h)      kubelet            Error: ErrImagePull
  Warning  Failed                  11h                    kubelet            Failed to pull image "kubernetesui/dashboard:v2.6.1": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/kubernetesui/dashboard:v2.6.1": failed to resolve reference "docker.io/kubernetesui/dashboard:v2.6.1": failed to do request: Head "https://registry-1.docker.io/v2/kubernetesui/dashboard/manifests/v2.6.1": write tcp 10.0.2.5:35896->34.194.164.123:443: write: connection reset by peer
  Warning  Failed                  11h (x6 over 11h)      kubelet            Error: ImagePullBackOff
  Normal   BackOff                 4m26s (x215 over 11h)  kubelet            Back-off pulling image "kubernetesui/dashboard:v2.6.1"
zin@master:~/Desktop$
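
Most of the `kubectl describe pod` output above is pod spec; the part that explains the failure is the Events table at the bottom, and within it the Warning rows. As a rough sketch, the hypothetical helper below prints only those rows. It assumes the layout shown above, where the table follows a line starting with `Events:` and each row begins with its type.

```shell
# Print only the Warning rows from the Events section of
# `kubectl describe pod` output read on stdin.
warning_events() {
  awk '
    /^Events:/ { in_events = 1; next }   # the events table starts here
    in_events && $1 == "Warning"         # keep Warning rows only
  '
}

# Hypothetical usage:
#   kubectl describe pod kubernetes-dashboard-6ff574dd47-cfn4z \
#     -n kubernetes-dashboard | warning_events
```

A similar view is available without text processing via `kubectl get events -n kubernetes-dashboard --field-selector type=Warning`. In the output above, the Warning rows show two separate problems: an early flannel sandbox failure (`/run/flannel/subnet.env` missing) and then repeated pull failures ending in `connection reset by peer`.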

 

 

4. Check whether the images exist locally

    -> Confirmed that the images are not present

# Check whether the images exist
$ sudo docker images | grep kubernetesui
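
One caveat about this check: the events above show both pods scheduled on worker1, and the phrase "failed to pull and unpack image" is wording that containerd uses. If the nodes do run containerd rather than Docker (an assumption based only on that message), the image cache that matters is containerd's on worker1, and `docker images` on the master may come back empty either way. The sketch below is a small hypothetical helper, `has_image`, that checks an image listing for a repository name; it assumes `crictl` is installed on the node.

```shell
# Exit 0 if any image row on stdin has a repository containing $1, else exit 1.
# Intended input: the table printed by `crictl images` (or `docker images`).
has_image() {
  awk -v repo="$1" '
    NR > 1 && index($1, repo) > 0 { found = 1 }   # skip the header row
    END { exit !found }
  '
}

# Hypothetical usage, run on the node that hosts the pod (worker1 here):
#   sudo crictl images | has_image kubernetesui/dashboard \
#     && echo "cached" || echo "missing"
```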

 

5.  Try pulling the images manually (which I took to be the direct fix)

     -> The pull failed with 'read: connection reset by peer' (I was not able to resolve this error directly)
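
Since the manual pull fails with the same 'connection reset by peer' as the kubelet's attempts, the problem is probably between the node and the registry and not in Kubernetes itself. One way to narrow that down is to request the registry's `/v2/` endpoint directly from the affected node. The sketch below interprets the status code that curl reports; the function name `registry_status` is hypothetical, and it assumes that Docker Hub answers unauthenticated `/v2/` requests with 401 and that curl prints `000` when no HTTP response was received.

```shell
# Map the HTTP status printed by curl's %{http_code} to a rough diagnosis.
registry_status() {
  case "$1" in
    200|401) echo "reachable" ;;   # 401 just means "authenticate first"
    000)     echo "no HTTP response (connection failed or was reset)" ;;
    *)       echo "unexpected HTTP status $1" ;;
  esac
}

# Hypothetical usage on the affected node:
#   code=$(curl -s -o /dev/null -m 10 -w '%{http_code}' https://registry-1.docker.io/v2/)
#   registry_status "$code"
```

If this reports no HTTP response on the worker but succeeds elsewhere, the node's network path (NAT, proxy, firewall, DNS) is the next thing to look at, which would be consistent with the issue going away after the static IP was reconfigured.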

 

 

Reference URL 1: https://m.blog.naver.com/onlywin7788/221845944242

Reference URL 2: https://github.com/kubernetes/minikube/issues/15424
