Tencent Cloud TKE Pod case: Pod stuck in Pending state due to PV mount timeout

2020-11-27 16:57:30

Problem description

When a business service updates its image for a release, the Pod stays in the Pending state and the update keeps failing.

Troubleshooting approach

  1. First check what went wrong during Pod startup:

`kubectl describe po -n {namespace}` shows that mounting the PV timed out:

`Unable to mount volumes for pod "xxx-test-xx-0_ns-prj57r7d-1091927-test(52bcf47a-2354-11eb-a92c-525400b26555)": timeout expired waiting for volumes to attach or mount for pod "ns-prj57r7d-1091927-test"/"xxx-test-xx-0". list of unmounted volumes=pretty. list of unattached volumes=pretty cgroup shm xx filebeatdata applogdata filebeatconfig default-token-lgrlv`
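
The `list of unmounted volumes` field in this message names the volume that is blocking the Pod. A minimal sketch of pulling it out with `sed` follows. The function name `unmounted_volumes` is hypothetical, and the optional square brackets are handled only on the assumption that some kubelet versions print the list as `[a b c]`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a kubelet mount-timeout message on stdin and
# print the names listed after "list of unmounted volumes=".
unmounted_volumes() {
  # \[? and \]? tolerate an optional [ ... ] around the list (assumption).
  sed -nE 's/.*list of unmounted volumes=\[?([^].]*)\]?\. list of unattached volumes=.*/\1/p'
}

# Possible usage against a live cluster (NS is a placeholder for the namespace):
#   kubectl get events -n "$NS" --field-selector reason=FailedMount \
#     -o jsonpath='{range .items[*]}{.message}{"\n"}{end}' | unmounted_volumes
```

For the message in this case the helper prints `pretty`, so the volume named `pretty` is the one to trace through the attach and mount phases.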

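  2. During Pod startup, the PV is mounted before the containers start. For a block-storage PV this involves two actions: attach and mount. In the attach phase, CBS is called to attach the disk to the node. Inspect the PV with `kubectl get pv pvc-845bdb98-7a9f-4156-8aa2-8c7c08b65e90 -o yaml`: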
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    Provisioner_Id: ""
    kubernetes.io/createdby: qcloud-cbs-dynamic-provisioner
    kubernetes.io/disk-delete-with-cluster-deletion: "true"
    pv.kubernetes.io/provisioned-by: cloud.tencent.com/qcloud-cbs
  creationTimestamp: "2020-07-30T07:14:41Z"
  finalizers:
  - kubernetes.io/pv-protection
  labels:
    failure-domain.beta.kubernetes.io/region: bj
    failure-domain.beta.kubernetes.io/zone: "800001"
  name: pvc-845bdb98-7a9f-4156-8aa2-8c7c08b65e90
  resourceVersion: "11784521"
  selfLink: /api/v1/persistentvolumes/pvc-845bdb98-7a9f-4156-8aa2-8c7c08b65e90
  uid: 1d9380a9-8218-4d28-a79c-e77921a929bb
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 20Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: cbs-test1-0
    namespace: default
    resourceVersion: "11784500"
    uid: 845bdb98-7a9f-4156-8aa2-8c7c08b65e90
  persistentVolumeReclaimPolicy: Delete
  qcloudCbs:
    cbsDiskId: disk-xx
  storageClassName: cbs
  volumeMode: Filesystem
status:
  phase: Bound
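
The attach phase can also be checked from the API side. The sketch below reads the disk ID from the `spec.qcloudCbs.cbsDiskId` field shown in the manifest above and compares it with the volumes the node reports in `status.volumesAttached`. The function names are hypothetical, and whether the node status is populated for this volume plugin, and in what name format, is an assumption to verify on your cluster.

```shell
#!/usr/bin/env bash
# Print the CBS disk ID recorded in a PV (field taken from the manifest above).
pv_disk_id() {
  kubectl get pv "$1" -o jsonpath='{.spec.qcloudCbs.cbsDiskId}'
}

# List the volumes a node reports as attached, one "name<TAB>devicePath" per line.
node_attached_volumes() {
  kubectl get node "$1" \
    -o jsonpath='{range .status.volumesAttached[*]}{.name}{"\t"}{.devicePath}{"\n"}{end}'
}

# Succeed if the node reports an attached volume whose line contains the disk ID.
# Assumption: the plugin embeds the disk ID in the volume name or device path.
disk_attached_to_node() {
  node_attached_volumes "$1" | grep -F -q -- "$2"
}

# Possible usage (the node name is a placeholder):
#   disk_attached_to_node "<node-name>" "$(pv_disk_id pvc-845bdb98-7a9f-4156-8aa2-8c7c08b65e90)"
```

If the check succeeds, the attach phase most likely completed and the problem lies in the mount phase on the node.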
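  3. Normally the flow then moves on to the second phase, mount. Check the mount information on the node with `mount | grep disk-oeuba9ig` to see whether the disk is mounted. If it is not, go on to inspect the disks: `cd /dev/disk/by-id`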

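Check whether the disk was attached successfully, that is, whether an entry for the disk (`virtio-disk-xx`) exists under `/dev/disk/by-id`.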
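
The two node-side checks can be sketched as small shell functions. The function names and the optional path arguments are hypothetical (the extra arguments only exist so the checks can be tried against test files), and the `virtio-<disk id>` link name is taken from the workaround in this article rather than from any documented guarantee.

```shell
#!/usr/bin/env bash
# Succeed if the mount table mentions the disk ID.
# $1 = disk ID, $2 = mount table file (defaults to /proc/mounts).
disk_is_mounted() {
  grep -F -q -- "$1" "${2:-/proc/mounts}"
}

# Print the device that the by-id link for the disk resolves to;
# fail if the link does not exist.
# $1 = disk ID, $2 = directory holding the links (defaults to /dev/disk/by-id).
by_id_target() {
  local link="${2:-/dev/disk/by-id}/virtio-$1"
  [ -L "$link" ] && readlink -f "$link"
}

# Possible usage on the node:
#   disk_is_mounted disk-xx || by_id_target disk-xx || echo "by-id link missing"
```

A missing link while the block device itself is present on the node matches the situation that the workaround in this article addresses.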

Solution:

As a temporary workaround, create the symlink manually: `ln -s /dev/vde /dev/disk/by-id/virtio-disk-xx`
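
A slightly more defensive version of the same workaround is sketched below. It refuses to create a link to a device that does not exist and leaves an existing link alone. The function name `ensure_by_id_link` and its third argument are hypothetical, and the check uses `-e` rather than a block-device test only so the function can be tried against ordinary files.

```shell
#!/usr/bin/env bash
# Create <dir>/virtio-<disk id> pointing at the device if the link is missing.
# $1 = device path, $2 = disk ID, $3 = link directory (defaults to /dev/disk/by-id).
ensure_by_id_link() {
  local dev="$1" disk_id="$2" dir="${3:-/dev/disk/by-id}"
  local link="$dir/virtio-$disk_id"

  # Do not create a dangling link.
  if [ ! -e "$dev" ]; then
    echo "device $dev not found" >&2
    return 1
  fi

  # Leave an existing link untouched so the call is safe to repeat.
  if [ -L "$link" ]; then
    echo "link $link already exists"
    return 0
  fi

  ln -s "$dev" "$link"
}

# Possible usage on the node, with the values from this case:
#   ensure_by_id_link /dev/vde disk-xx
```

Confirm which device actually backs the disk before linking, since the `/dev/vde` name in this case is specific to that node.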
