A Kubernetes Pod Suddenly Could Not Mount Its Ceph RBD Volume

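This article may be revised at any time, and the WeChat official account version cannot be updated. Subscribe to the blog for the latest version: https://fuckcloudnative.io
Preface
Is Kubernetes full of pitfalls? Yes! Is Ceph full of pitfalls? Yes! And the two of them together? One giant pitfall!
I had previously deployed a highly available Harbor image registry in a Kubernetes cluster, with Ceph RBD providing persistent storage. Everything was going nicely until yesterday, when one node went NotReady. The Pod of one Harbor component was rescheduled as a result, but the rescheduled Pod failed to start.
Checking the events with describe pod showed the following Warning: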
Events:
  Type     Reason              Age   From                     Message
  ----     ------              ----  ----                     -------
  Normal   Scheduled           23s   default-scheduler        Successfully assigned harbor/harbor-harbor-registry-5796cdddd7-kxzp9 to k8s03
  Warning  FailedAttachVolume  22s   attachdetach-controller  Multi-Attach error for volume "pvc-ec045b5e-2471-469d-9a1b-6e7db0e938b3" Volume is already exclusively attached to one node and can't be attached to another
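Sure enough, the RBD image behind this PV was still held by the old Pod, so it could not be attached to the new one. I logged in to the NotReady node and removed the old Pod's containers directly with docker rm -vf xxx, but that made no difference.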
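The Multi-Attach error is reported by the attach/detach controller, so one way to confirm which node the volume is still recorded as attached to is to list the cluster's VolumeAttachment objects. This was not part of the original troubleshooting. It is a minimal sketch: the helper name `attached_nodes` is hypothetical, and it assumes the default `kubectl get volumeattachment` table layout (NAME, ATTACHER, PV, NODE, ATTACHED, AGE).

```shell
# Print the node(s) a PV is currently recorded as attached to, based on the
# VolumeAttachment objects. Hypothetical helper; assumes the default table
# columns NAME ATTACHER PV NODE ATTACHED AGE (PV is column 3, NODE is column 4).
attached_nodes() {
    pv="$1"
    kubectl get volumeattachment --no-headers |
        awk -v pv="$pv" '$3 == pv { print $4 }'
}

# Usage:
#   attached_nodes pvc-ec045b5e-2471-469d-9a1b-6e7db0e938b3
```

If this prints the NotReady node, the old attachment record has not been cleaned up yet, which would match the event above.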
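It looked like the only option left was to forcibly unmap the block device that the RBD image was mapped to on the old Pod's node. First I needed to find the RBD image behind the PV, which can be read straight from the PV: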
→ kubectl -n harbor get pv pvc-ec045b5e-2471-469d-9a1b-6e7db0e938b3 -o go-template='{{.spec.csi.volumeAttributes.imageName}}'
csi-vol-bf0dc641-4a5a-11eb-988c-6ab597a1411c
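The same lookup can be written with kubectl's jsonpath output so that the pool and image come back together in the `pool/image` form that rbd expects. This is only a sketch: the helper name `pv_rbd_image` is hypothetical, and it assumes the PV carries a `pool` key next to `imageName` under spec.csi.volumeAttributes, which is worth checking on your own cluster.

```shell
# Print "<pool>/<image>" for a ceph-csi PV.
# Hypothetical helper; assumes both `pool` and `imageName` are present in
# spec.csi.volumeAttributes of the PV.
pv_rbd_image() {
    kubectl get pv "$1" \
        -o jsonpath='{.spec.csi.volumeAttributes.pool}/{.spec.csi.volumeAttributes.imageName}'
}

# Usage:
#   pv_rbd_image pvc-ec045b5e-2471-469d-9a1b-6e7db0e938b3
```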
On the Ceph admin node, check who is currently using the image:
→ rbd status kubernetes/csi-vol-bf0dc641-4a5a-11eb-988c-6ab597a1411c
Watchers:
 watcher=172.16.7.1:0/3619044864 client.195600 cookie=18446462598732840980
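When there are several images to check, pulling the watcher address out by eye gets tedious. The following is a small sketch that extracts it from the `rbd status` output. The helper name `watcher_ips` is hypothetical, and it assumes IPv4 watcher addresses in the `watcher=IP:port/nonce` form seen in the output above.

```shell
# Read `rbd status` output on stdin and print the IP address of every watcher.
# Hypothetical helper; only handles IPv4 addresses in the
# `watcher=IP:port/nonce` form.
watcher_ips() {
    sed -n 's/.*watcher=\([0-9.]*\):.*/\1/p'
}

# Usage (on a Ceph admin node):
#   rbd status kubernetes/csi-vol-bf0dc641-4a5a-11eb-988c-6ab597a1411c | watcher_ips
```

An empty result means no client is watching the image, in which case a forced unmap should not be needed.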
That was the culprit, so I logged in to 172.16.7.1 and forcibly unmapped the block device:
→ docker ps|grep csi
77255fe4f26b        650757c4f32d                  "/usr/local/bin/ceph…"   3 weeks ago         Up 3 weeks                              k8s_liveness-prometheus_csi-rbdplugin-hscf8_ceph-csi_2b7da817-3f4a-4e8f-9f99-a39da07c5b94_5
fb4e5e10f064        650757c4f32d                  "/usr/local/bin/ceph…"   3 weeks ago         Up 3 weeks                              k8s_csi-rbdplugin_csi-rbdplugin-hscf8_ceph-csi_2b7da817-3f4a-4e8f-9f99-a39da07c5b94_5
5330c84529e9        37c1d9ea538b                  "/csi-node-driver-re…"   3 weeks ago         Up 3 weeks                              k8s_driver-registrar_csi-rbdplugin-hscf8_ceph-csi_2b7da817-3f4a-4e8f-9f99-a39da07c5b94_6
4452755ffccf        k8s.gcr.io/pause:3.2          "/pause"                 3 weeks ago         Up 3 weeks                              k8s_POD_csi-rbdplugin-hscf8_ceph-csi_2b7da817-3f4a-4e8f-9f99-a39da07c5b94_5
→ docker exec -it fb4e5e10f064 bash
[root@k8s01 /]# rbd showmapped|grep csi-vol-bf0dc641-4a5a-11eb-988c-6ab597a1411c
4   kubernetes             csi-vol-bf0dc641-4a5a-11eb-988c-6ab597a1411c  -     /dev/rbd4
[root@k8s01 /]# rbd unmap -o force /dev/rbd4
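The last two commands in the session above (look up the device, then force-unmap it) can be combined into one guarded step. This is a sketch under assumptions, not a hardened tool. Both function names are hypothetical, and the device is read from the last column because the number of columns printed by `rbd showmapped` may differ between Ceph releases. A forced unmap can discard in-flight I/O, so it only makes sense once the old Pod is confirmed to be gone.

```shell
# Read `rbd showmapped` output on stdin and print the device that the given
# image is mapped to (taken from the last column of the matching row).
mapped_device() {
    awk -v img="$1" '{
        for (i = 1; i < NF; i++)
            if ($i == img) { print $NF; next }
    }'
}

# Hypothetical wrapper: force-unmap the image only if it is mapped on this node.
force_unmap_image() {
    dev=$(rbd showmapped | mapped_device "$1")
    if [ -n "$dev" ]; then
        rbd unmap -o force "$dev"
    else
        echo "image $1 is not mapped on this node" >&2
        return 1
    fi
}

# Usage (inside the csi-rbdplugin container on the old node):
#   force_unmap_image csi-vol-bf0dc641-4a5a-11eb-988c-6ab597a1411c
```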
Looking at the new Pod again, it has now started successfully:
Events:
  Type     Reason                  Age                   From                     Message
  ----     ------                  ----                  ----                     -------
  Normal   Scheduled               18m                   default-scheduler        Successfully assigned harbor/harbor-harbor-registry-5796cdddd7-kxzp9 to k8s03
  Warning  FailedAttachVolume      18m                   attachdetach-controller  Multi-Attach error for volume "pvc-ec045b5e-2471-469d-9a1b-6e7db0e938b3" Volume is already exclusively attached to one node and can't be attached to another
  Warning  FailedMount             14m                   kubelet, k8s03           Unable to attach or mount volumes: unmounted volumes=[registry-data], unattached volumes=[default-token-phjbz registry-data registry-root-certificate registry-htpasswd registry-config]: timed out waiting for the condition
  Normal   SuccessfulAttachVolume  12m                   attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-ec045b5e-2471-469d-9a1b-6e7db0e938b3"
  Warning  FailedMount             12m                   kubelet, k8s03           Unable to attach or mount volumes: unmounted volumes=[registry-data], unattached volumes=[registry-htpasswd registry-config default-token-phjbz registry-data registry-root-certificate]: timed out waiting for the condition
  Warning  FailedMount             5m21s (x2 over 16m)   kubelet, k8s03           Unable to attach or mount volumes: unmounted volumes=[registry-data], unattached volumes=[registry-config default-token-phjbz registry-data registry-root-certificate registry-htpasswd]: timed out waiting for the condition
  Warning  FailedMount             3m5s (x2 over 9m55s)  kubelet, k8s03           Unable to attach or mount volumes: unmounted volumes=[registry-data], unattached volumes=[registry-root-certificate registry-htpasswd registry-config default-token-phjbz registry-data]: timed out waiting for the condition
  Warning  FailedMount             2m54s (x9 over 11m)   kubelet, k8s03           MountVolume.MountDevice failed for volume "pvc-ec045b5e-2471-469d-9a1b-6e7db0e938b3" : rpc error: code = Internal desc = rbd image kubernetes/csi-vol-bf0dc641-4a5a-11eb-988c-6ab597a1411c is still being used
  Warning  FailedMount             50s (x2 over 7m39s)   kubelet, k8s03           Unable to attach or mount volumes: unmounted volumes=[registry-data], unattached volumes=[registry-data registry-root-certificate registry-htpasswd registry-config default-token-phjbz]: timed out waiting for the condition
  Normal   Pulling                 15s                   kubelet, k8s03           Pulling image "goharbor/registry-photon:v2.1.2"
  Normal   Pulled                  12s                   kubelet, k8s03           Successfully pulled image "goharbor/registry-photon:v2.1.2"
  Normal   Created                 12s                   kubelet, k8s03           Created container registry
  Normal   Started                 12s                   kubelet, k8s03           Started container registry

