Hands-On Notes on Deploying Ceph with Rook 1.5.1
1. Rook Overview
1.1 Introduction to Rook
Rook is an open-source cloud-native storage orchestrator: it provides the platform, framework, and support for various storage solutions to integrate natively with cloud-native environments. It currently focuses on file, block, and object storage services for cloud-native environments, and implements a distributed storage service that is self-managing, self-scaling, and self-healing.
Rook automates deployment, bootstrapping, configuration, provisioning, scaling up and down, upgrades, migration, disaster recovery, monitoring, and resource management. To do all of this, Rook relies on the underlying container orchestration platform, such as Kubernetes.
Rook currently supports deploying Ceph, NFS, Minio Object Store, EdgeFS, Cassandra, and CockroachDB storage.
Project: https://github.com/rook/rook
Website: https://rook.io/
1.2 Rook Components
Rook has three main components:
Rook Operator: the component through which Rook interacts with Kubernetes; there is only one per Rook cluster.
Agent or Driver: either the Ceph CSI driver or the Flex driver. Flex is the deprecated driver; before installing, confirm whether your Kubernetes version supports CSI, and only choose Flex if it does not (or if you prefer not to use CSI). By default the driver is installed on every node; you can restrict it to specific nodes with node affinity.
Device discovery: detects whether newly attached devices can be used as storage. It is enabled by setting ROOK_ENABLE_DISCOVERY_DAEMON to "true" in the operator configuration.
1.3 Rook & Ceph Architecture
(Figure: how Rook integrates with Kubernetes.)
(Figure: architecture of a Ceph cluster deployed with Rook.)
The deployed Ceph system can serve three kinds of volume claims:
Block Storage: currently the most stable.
FileSystem: requires deploying MDS and has kernel requirements.
Object: requires deploying RGW.
2. Deploying Rook
2.1 Prerequisites
2.1.1 Version requirements
Kubernetes v1.11 or higher.
2.1.2 Storage requirements
Ceph deployed by Rook does not support using an LVM logical volume directly as an OSD device. If you want to use LVM, you can do it through PVCs; the approach is mentioned later in these notes (see section 5.1).
To configure the Ceph storage cluster, at least one of the following local storage options is required:
A raw device (no partitions or formatted filesystem).
A raw partition (no formatted filesystem); you can check with lsblk -f: if FSTYPE is not empty, the device already has a filesystem (see the quick check below).
A PV available from a storage class in block mode.
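A quick way to check whether a device qualifies, assuming a candidate device such as /dev/nvme0n1p1 (substitute your own device name):
# List block devices and their filesystem type; an empty FSTYPE means no filesystem, so the device can be used for an OSD
lsblk -f
# Check a single candidate device
lsblk -f /dev/nvme0n1p1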
2.1.3 System requirements
The environment used for this installation:
kubernetes 1.18
centos 7.8
kernel 5.4.65-200.el7.x86_64
calico 3.16
2.1.3.1 Install the lvm2 package
sudo yum install -y lvm2
2.1.3.2 Kernel requirements
RBD
Most distribution kernels ship with the RBD module, but it is best to confirm:
foxchan@~$ lsmod|grep rbd
rbd                   114688  0
libceph               368640  1 rbd
You can use the following to make sure the module is loaded at boot:
cat > /etc/sysconfig/modules/rbd.modules << EOF
modprobe rbd
EOF
chmod 755 /etc/sysconfig/modules/rbd.modules  # the module script must be executable to run at boot
CephFS
If you want to use CephFS, the minimum required kernel version is 4.17.
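A quick sanity check of the running kernel before enabling CephFS:
# CephFS quota enforcement requires kernel 4.17 or newer
uname -r
# the cephfs kernel module should also be available
modinfo ceph | head -n 3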
2.2 Deploying Rook
Clone the Rook release from GitHub:
git clone --single-branch --branch v1.5.1 https://github.com/rook/rook.git
Install the common resources:
cd rook/cluster/examples/kubernetes/ceph
kubectl create -f crds.yaml -f common.yaml
Install the operator:
kubectl apply -f operator.yaml
If this is going to production, plan ahead: the operator configuration cannot be changed after Ceph is installed, otherwise Rook will delete the cluster and rebuild it.
The changes I made are as follows:
# Enable the CephFS CSI driver
ROOK_CSI_ENABLE_CEPHFS: "true"
# Use the kernel client instead of ceph-fuse
CSI_FORCE_CEPHFS_KERNEL_CLIENT: "true"
# Point the CSI images at a private registry to speed up deployment
ROOK_CSI_CEPH_IMAGE: "harbor.foxchan.com/google_containers/cephcsi/cephcsi:v3.1.2"
ROOK_CSI_REGISTRAR_IMAGE: "harbor.foxchan.com/google_containers/k8scsi/csi-node-driver-registrar:v2.0.1"
ROOK_CSI_RESIZER_IMAGE: "harbor.foxchan.com/google_containers/k8scsi/csi-resizer:v1.0.0"
ROOK_CSI_PROVISIONER_IMAGE: "harbor.foxchan.com/google_containers/k8scsi/csi-provisioner:v2.0.0"
ROOK_CSI_SNAPSHOTTER_IMAGE: "harbor.foxchan.com/google_containers/k8scsi/csi-snapshotter:v3.0.0"
ROOK_CSI_ATTACHER_IMAGE: "harbor.foxchan.com/google_containers/k8scsi/csi-attacher:v3.0.0"
# NODE_AFFINITY can be used to pin the CSI components to specific nodes.
# I separate the plugin and the provisioner; schedule them according to your cluster resources.
CSI_PROVISIONER_NODE_AFFINITY: "app.rook.role=csi-provisioner"
CSI_PLUGIN_NODE_AFFINITY: "app.rook.plugin=csi"
# Change the metrics ports (optional). My cluster uses host networking, so I changed them to avoid port conflicts.
# Configure CSI CephFS grpc and liveness metrics port
CSI_CEPHFS_GRPC_METRICS_PORT: "9491"
CSI_CEPHFS_LIVENESS_METRICS_PORT: "9481"
# Configure CSI RBD grpc and liveness metrics port
CSI_RBD_GRPC_METRICS_PORT: "9490"
CSI_RBD_LIVENESS_METRICS_PORT: "9480"
# Pull the rook image from a private registry to speed up deployment
image: harbor.foxchan.com/google_containers/rook/ceph:v1.5.1
# Restrict device discovery to the dedicated storage nodes
        - name: DISCOVER_AGENT_NODE_AFFINITY
          value: "app.rook=storage"
# Enable automatic device discovery
        - name: ROOK_ENABLE_DISCOVERY_DAEMON
          value: "true"
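After applying operator.yaml, it is worth confirming that the operator and the discovery pods come up before moving on to the cluster. A minimal check, assuming the default rook-ceph namespace and the default labels from the Rook manifests:
kubectl -n rook-ceph get pod -l app=rook-ceph-operator
kubectl -n rook-ceph get pod -l app=rook-discover -o wide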
2.3 Deploying the Ceph cluster
The contents of cluster.yaml must be adapted to your own hardware; read the comments in the file carefully to avoid the pitfalls I ran into.
The changes are as follows:
Apart from adding or removing OSD devices, any modification of this file requires reinstalling the Ceph cluster to take effect, so plan the cluster ahead of time. If you modify it and apply without uninstalling Ceph first, a cluster rebuild is triggered and the cluster can end up broken.
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
# The cluster name; only one Ceph cluster is supported per namespace
  name: rook-ceph
  namespace: rook-ceph
spec:
# Ceph version notes:
# v13 is mimic, v14 is nautilus, and v15 is octopus.
  cephVersion:
# Pull the ceph image from a private registry to speed up deployment
    image: harbor.foxchan.com/google_containers/ceph/ceph:v15.2.5
# Whether unsupported ceph versions are allowed
    allowUnsupported: false
# Host path where rook stores its data on each node
  dataDirHostPath: /data/rook
# Whether to continue the upgrade even if the pre-upgrade checks fail
  skipUpgradeChecks: false
# Since 1.5 the number of mons must be odd
  mon:
    count: 3
# Whether multiple mon pods may run on a single node
    allowMultiplePerNode: false
  mgr:
    modules:
    - name: pg_autoscaler
      enabled: true
# Enable the dashboard, disable ssl, and set the port to 7000. You can keep the default https config;
# I did this to simplify my ingress setup.
  dashboard:
    enabled: true
    port: 7000
    ssl: false
# Enable the PrometheusRule
  monitoring:
    enabled: true
# Namespace the PrometheusRule is deployed into; defaults to the namespace of this CR
    rulesNamespace: rook-ceph
# Use host networking, which works around the bug that prevents CephFS PVCs from being used
  network:
    provider: host
# Enable the crash collector; a crash collector pod is created on every node running a Ceph daemon
  crashCollector:
    disable: false
# Node affinity: install each component only on the designated nodes
  placement:
    mon:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: ceph-mon
              operator: In
              values:
              - enabled
    osd:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: ceph-osd
              operator: In
              values:
              - enabled
    mgr:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: ceph-mgr
              operator: In
              values:
              - enabled
# Storage settings. The defaults are all true, which would wipe and initialize every device on every node in the cluster.
  storage: # cluster level storage configuration and selection
    useAllNodes: false     # do not use all nodes
    useAllDevices: false   # do not use all devices
    nodes:
    - name: "192.168.1.162"  # storage node
      devices:
      - name: "nvme0n1p1"    # use the nvme0n1p1 disk
    - name: "192.168.1.163"
      devices:
      - name: "nvme0n1p1"
    - name: "192.168.1.164"
      devices:
      - name: "nvme0n1p1"
    - name: "192.168.1.213"
      devices:
      - name: "nvme0n1p1"
More CephCluster CRD options are documented at:
https://github.com/rook/rook/blob/master/Documentation/ceph-cluster-crd.md
Run the installation:
kubectl apply -f cluster.yaml
# It takes a while until all pods are up and running
[foxchan@k8s-master ceph]$ kubectl get pods -n rook-ceph
NAME                                                      READY   STATUS      RESTARTS   AGE
csi-cephfsplugin-b5tlr                                    3/3     Running     0          19h
csi-cephfsplugin-mjssm                                    3/3     Running     0          19h
csi-cephfsplugin-provisioner-5cf5ffdc76-mhdgz             6/6     Running     0          19h
csi-cephfsplugin-provisioner-5cf5ffdc76-rpdl8             6/6     Running     0          19h
csi-cephfsplugin-qmvkc                                    3/3     Running     0          19h
csi-cephfsplugin-tntzd                                    3/3     Running     0          19h
csi-rbdplugin-4p75p                                       3/3     Running     0          19h
csi-rbdplugin-89mzz                                       3/3     Running     0          19h
csi-rbdplugin-cjcwr                                       3/3     Running     0          19h
csi-rbdplugin-ndjcj                                       3/3     Running     0          19h
csi-rbdplugin-provisioner-658dd9fbc5-fwkmc                6/6     Running     0          19h
csi-rbdplugin-provisioner-658dd9fbc5-tlxd8                6/6     Running     0          19h
prometheus-rook-prometheus-0                              2/2     Running     1          3d17h
rook-ceph-mds-myfs-a-5cbcdc6f9c-7mdsv                     1/1     Running     0          19h
rook-ceph-mds-myfs-b-5f4cc54b87-m6m6f                     1/1     Running     0          19h
rook-ceph-mgr-a-f98d4455b-bwhw7                           1/1     Running     0          20h
rook-ceph-mon-a-5d445d4b8d-lmg67                          1/1     Running     1          20h
rook-ceph-mon-b-769c6fd76f-jrlc8                          1/1     Running     0          20h
rook-ceph-mon-c-6bfd8954f5-tbsnd                          1/1     Running     0          20h
rook-ceph-operator-7d8cc65dc-8wtl8                        1/1     Running     0          20h
rook-ceph-osd-0-c558ff759-bzbgw                           1/1     Running     0          20h
rook-ceph-osd-1-5c97d69d78-dkxbb                          1/1     Running     0          20h
rook-ceph-osd-2-7dddc7fd56-p58mw                          1/1     Running     0          20h
rook-ceph-osd-3-65ff985c7d-9gfgj                          1/1     Running     0          20h
rook-ceph-osd-prepare-192.168.1.213-pw5gr                 0/1     Completed   0          19h
rook-ceph-osd-prepare-192.168.1.162-wtkm8                 0/1     Completed   0          19h
rook-ceph-osd-prepare-192.168.1.163-b86r2                 0/1     Completed   0          19h
rook-ceph-osd-prepare-192.168.1.164-tj79t                 0/1     Completed   0          19h
rook-discover-89v49                                       1/1     Running     0          20h
rook-discover-jdzhn                                       1/1     Running     0          20h
rook-discover-sl9bv                                       1/1     Running     0          20h
rook-discover-wg25w                                       1/1     Running     0          20h
2.4 Adding and removing OSDs
2.4.1 Add the required labels
kubectl label nodes 192.168.1.165 app.rook=storage
kubectl label nodes 192.168.1.165 ceph-osd=enabled
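Optionally confirm the labels landed on the new node before touching cluster.yaml:
kubectl get nodes -l ceph-osd=enabled
kubectl get nodes -l app.rook=storage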
2.4.2 Edit cluster.yaml
    nodes:
    - name: "192.168.1.162"
      devices:
      - name: "nvme0n1p1"
    - name: "192.168.1.163"
      devices:
      - name: "nvme0n1p1"
    - name: "192.168.1.164"
      devices:
      - name: "nvme0n1p1"
    - name: "192.168.1.213"
      devices:
      - name: "nvme0n1p1"
    # add the disk of the new 192.168.1.165 node
    - name: "192.168.1.165"
      devices:
      - name: "nvme0n1p1"
2.4.3 Apply cluster.yaml
kubectl apply -f cluster.yaml
2.4.4 Removing an OSD
Remove the corresponding node entry from cluster.yaml, then apply it again.
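Whether adding or removing OSDs, it helps to watch the OSD set converge from the toolbox (section 2.6) after applying; a minimal check:
# run inside the toolbox pod
ceph osd tree
ceph status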
2.5 Installing the dashboard
The dashboard is already enabled by the previous steps; all that is left is to expose the dashboard service. There are several ways to do this, and the yaml directory contains many examples; pick whichever suits you. I use an ingress. Below is my own Traefik IngressRoute:
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: traefik-ceph-dashboard
  annotations:
    kubernetes.io/ingress.class: traefik-v2.3
spec:
  entryPoints:
    - web
  routes:
  - match: Host(`ceph.foxchan.com`)
    kind: Rule
    services:
    - name: rook-ceph-mgr-dashboard
      namespace: rook-ceph
      port: 7000
    middlewares:
      - name: gs-ipwhitelist
Logging in to the dashboard requires credentials. Rook creates a default user named admin in the namespace where the Ceph cluster is running, and generates a secret named rook-ceph-dashboard-password.
To retrieve the generated password, run the following:
kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode && echo
2.6 Installing the toolbox
Run the following command:
kubectl apply -f toolbox.yaml
Once that succeeds, confirm the toolbox pod has started:
kubectl -n rook-ceph get pod -l "app=rook-ceph-tools"
You can then log in to the pod and run any ceph command:
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
For example:
ceph status
ceph osd status
ceph df
rados df
To remove the toolbox:
kubectl -n rook-ceph delete deploy/rook-ceph-tools
2.7 Prometheus monitoring
Deploying monitoring is straightforward: use the Prometheus Operator and run a standalone Prometheus instance.
Install the prometheus operator:
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/v0.40.0/bundle.yaml
Install prometheus:
git clone --single-branch --branch v1.5.1 https://github.com/rook/rook.git
cd rook/cluster/examples/kubernetes/ceph/monitoring
kubectl create -f service-monitor.yaml
kubectl create -f prometheus.yaml
kubectl create -f prometheus-service.yaml
By default it is exposed via NodePort:
echo "http://$(kubectl -n rook-ceph -o jsonpath={.status.hostIP} get pod prometheus-rook-prometheus-0):30900"
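If the URL does not respond, a quick sanity check is to confirm the ServiceMonitor and the Prometheus pod were created (names as generated by the manifests above):
kubectl -n rook-ceph get servicemonitor
kubectl -n rook-ceph get pod prometheus-rook-prometheus-0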
Enabling Prometheus alerts
This must be done before the Ceph cluster is installed.
Install the RBAC:
kubectl create -f cluster/examples/kubernetes/ceph/monitoring/rbac.yaml
Make sure monitoring is enabled in cluster.yaml:
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
[...]
spec:
[...]
  monitoring:
    enabled: true
    rulesNamespace: "rook-ceph"
[...]
Grafana dashboards
Grafana 7.2.0 or newer is required.
Recommended dashboards:
Ceph - Cluster
Ceph - OSD (Single)
Ceph - Pools
2.8 Deleting the Ceph cluster
Before deleting the Ceph cluster, clean up the pods that use its storage.
Delete the block storage and the file storage:
kubectl delete -n rook-ceph cephblockpool replicapool
kubectl delete storageclass rook-ceph-block
kubectl delete -f csi/cephfs/filesystem.yaml
kubectl delete storageclass csi-cephfs rook-ceph-block
Delete the operator and the related CRDs:
kubectl delete -f operator.yaml
kubectl delete -f common.yaml
kubectl delete -f crds.yaml
Clean up the data on the hosts
After the Ceph cluster is deleted, the /data/rook/ directory on every node that ran Ceph components still contains the old cluster's configuration.
If you deploy a new Ceph cluster later, delete this leftover information first, otherwise the monitors will fail to start:
# cat clean-rook-dir.sh
hosts=(
  192.168.1.213
  192.168.1.162
  192.168.1.163
  192.168.1.164
)
for host in ${hosts[@]} ; do
  ssh $host "rm -rf /data/rook/*"
done
Wipe the devices
#!/usr/bin/env bash
DISK="/dev/nvme0n1p1"
# Zap the disk to a fresh, usable state (zap-all is important, b/c MBR has to be clean)
# You will have to run this step for all disks.
sgdisk --zap-all $DISK
# For HDDs, use:
dd if=/dev/zero of="$DISK" bs=1M count=100 oflag=direct,dsync
# For SSDs, use:
blkdiscard $DISK
# These steps only have to be run once on each node
# If rook sets up osds using ceph-volume, teardown leaves some devices mapped that lock the disks.
ls /dev/mapper/ceph-* | xargs -I% -- dmsetup remove %
# ceph-volume setup can leave ceph- directories in /dev (unnecessary clutter)
rm -rf /dev/ceph-*
If deleting the Ceph cluster gets stuck for some reason, run the following command first and the deletion will no longer hang:
kubectl -n rook-ceph patch cephclusters.ceph.rook.io rook-ceph -p '{"metadata":{"finalizers": []}}' --type=merge
2.9 Upgrading Rook
2.9.1 Patch version upgrades
Rook v1.5.0 to Rook v1.5.1
git clone --single-branch --branch v1.5.1 https://github.com/rook/rook.git
cd $YOUR_ROOK_REPO/cluster/examples/kubernetes/ceph/
kubectl apply -f common.yaml -f crds.yaml
kubectl -n rook-ceph set image deploy/rook-ceph-operator rook-ceph-operator=rook/ceph:v1.5.1
2.9.2 Upgrading across releases
Rook v1.4.x to Rook v1.5.x.
Preparation
Set the environment variables:
# Parameterize the environment
export ROOK_SYSTEM_NAMESPACE="rook-ceph"
export ROOK_NAMESPACE="rook-ceph"
The cluster must be healthy before you upgrade.
All pods are Running:
kubectl -n $ROOK_NAMESPACE get pods
Use the toolbox to check that the Ceph cluster status is healthy:
TOOLS_POD=$(kubectl -n $ROOK_NAMESPACE get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}')
kubectl -n $ROOK_NAMESPACE exec -it $TOOLS_POD -- ceph status
  cluster:
    id:     194d139f-17e7-4e9c-889d-2426a844c91b
    health: HEALTH_OK
  services:
    mon: 3 daemons, quorum a,b,c (age 25h)
    mgr: a(active, since 5h)
    mds: myfs:1 {0=myfs-b=up:active} 1 up:standby-replay
    osd: 4 osds: 4 up (since 25h), 4 in (since 25h)
  task status:
    scrub status:
        mds.myfs-a: idle
        mds.myfs-b: idle
  data:
    pools:   4 pools, 97 pgs
    objects: 2.08k objects, 7.6 GiB
    usage:   26 GiB used, 3.3 TiB / 3.3 TiB avail
    pgs:     97 active+clean
  io:
    client:   1.2 KiB/s rd, 2 op/s rd, 0 op/s wr
Upgrade the operator
1. Upgrade common.yaml and the CRDs:
git clone --single-branch --branch v1.5.1 https://github.com/rook/rook.git
cd rook/cluster/examples/kubernetes/ceph
kubectl apply -f common.yaml -f crds.yaml
2. Upgrade the Ceph CSI versions
You can edit the ConfigMap to pin your own image versions; if you are using the default configuration, no changes are needed:
kubectl -n rook-ceph get configmap rook-ceph-operator-config
ROOK_CSI_CEPH_IMAGE: "harbor.foxchan.com/google_containers/cephcsi/cephcsi:v3.1.1"
ROOK_CSI_REGISTRAR_IMAGE: "harbor.foxchan.com/google_containers/k8scsi/csi-node-driver-registrar:v2.0.1"
ROOK_CSI_PROVISIONER_IMAGE: "harbor.foxchan.com/google_containers/k8scsi/csi-provisioner:v2.0.0"
ROOK_CSI_SNAPSHOTTER_IMAGE: "harbor.foxchan.com/google_containers/k8scsi/csi-snapshotter:v3.0.0"
ROOK_CSI_ATTACHER_IMAGE: "harbor.foxchan.com/google_containers/k8scsi/csi-attacher:v3.0.0"
ROOK_CSI_RESIZER_IMAGE: "harbor.foxchan.com/google_containers/k8scsi/csi-resizer:v1.0.0"
3. Upgrade the Rook operator:
kubectl -n $ROOK_SYSTEM_NAMESPACE set image deploy/rook-ceph-operator rook-ceph-operator=rook/ceph:v1.5.1
4. Wait for the cluster to finish upgrading:
watch --exec kubectl -n $ROOK_NAMESPACE get deployments -l rook_cluster=$ROOK_NAMESPACE -o jsonpath='{range .items[*]}{.metadata.name}{"  \treq/upd/avl: "}{.spec.replicas}{"/"}{.status.updatedReplicas}{"/"}{.status.readyReplicas}{"  \trook-version="}{.metadata.labels.rook-version}{"\n"}{end}'
5. Verify that the upgrade is complete:
kubectl -n $ROOK_NAMESPACE get deployment -l rook_cluster=$ROOK_NAMESPACE -o jsonpath='{range .items[*]}{"rook-version="}{.metadata.labels.rook-version}{"\n"}{end}' | sort | uniq
Upgrading the Ceph version
If the cluster is not in a healthy state, the operator will refuse to upgrade.
1. Update the Ceph image:
NEW_CEPH_IMAGE='ceph/ceph:v15.2.5'
CLUSTER_NAME=rook-ceph
kubectl -n rook-ceph patch CephCluster rook-ceph --type=merge -p "{\"spec\": {\"cephVersion\": {\"image\": \"$NEW_CEPH_IMAGE\"}}}"
2. Watch the pods upgrade:
watch --exec kubectl -n $ROOK_NAMESPACE get deployments -l rook_cluster=$ROOK_NAMESPACE -o jsonpath='{range .items[*]}{.metadata.name}{"  \treq/upd/avl: "}{.spec.replicas}{"/"}{.status.updatedReplicas}{"/"}{.status.readyReplicas}{"  \tceph-version="}{.metadata.labels.ceph-version}{"\n"}{end}'
3. Confirm the Ceph cluster upgraded and is healthy:
kubectl -n $ROOK_NAMESPACE get deployment -l rook_cluster=$ROOK_NAMESPACE -o jsonpath='{range .items[*]}{"ceph-version="}{.metadata.labels.ceph-version}{"\n"}{end}' | sort | uniq
3. Deploying block storage
3.1 Create a pool and a StorageClass
# Define a block storage pool
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  # Each data replica must be placed in a different failure domain; host means each replica lands on a different machine
  failureDomain: host
  # Number of replicas
  replicated:
    size: 3
    # Disallow setting pool with replica 1, this could lead to data loss without recovery.
    # Make sure you're *ABSOLUTELY CERTAIN* that is what you want
    requireSafeReplicaSize: true
    # gives a hint (%) to Ceph in terms of expected consumption of the total cluster capacity of a given pool
    # for more info: https://docs.ceph.com/docs/master/rados/operations/placement-groups/#specifying-expected-pool-size
    #targetSizeRatio: .5
---
# Define a StorageClass
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
# Provisioner for this StorageClass; the rook-ceph prefix is the operator's namespace
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
    # clusterID is the namespace the Ceph cluster runs in
    # If you change this namespace, also change the namespace below where the secret namespaces are defined
    clusterID: rook-ceph
    # If you want to use erasure coded pool with RBD, you need to create
    # two pools. one erasure coded and one replicated.
    # You need to specify the replicated pool here in the `pool` parameter, it is
    # used for the metadata of the images.
    # The erasure coded pool must be set as the `dataPool` parameter below.
    #dataPool: ec-data-pool
    # The pool in which RBD images are created
    pool: replicapool
    # RBD image format. Defaults to "2".
    imageFormat: "2"
    # RBD image features; CSI RBD currently only supports layering
    imageFeatures: layering
    # Ceph admin credentials, generated automatically by the operator
    # in the same namespace as the cluster.
    csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
    csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
    csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
    csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
    csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
    csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
    # Filesystem type of the volume. Defaults to ext4. xfs is not recommended because of a potential deadlock
    # in hyperconverged setups where the volume is mounted on the same node as an OSD.
    csi.storage.k8s.io/fstype: ext4
# uncomment the following to use rbd-nbd as mounter on supported nodes
# **IMPORTANT**: If you are using rbd-nbd as the mounter, during upgrade you will be hit a ceph-csi
# issue that causes the mount to be disconnected. You will need to follow special upgrade steps
# to restart your application pods. Therefore, this option is not recommended.
#mounter: rbd-nbd
allowVolumeExpansion: true
reclaimPolicy: Delete
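Once applied, a quick check that the pool and StorageClass exist, plus an optional look from the toolbox to confirm the pool was created in Ceph:
kubectl -n rook-ceph get cephblockpool replicapool
kubectl get storageclass rook-ceph-block
# inside the toolbox pod
ceph osd pool ls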
3.2 Demo
It is recommended to put the PVC and the application in the same yaml file.
# Create the PVC
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-demo-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: rook-ceph-block
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: csirbd-demo-pod
  labels:
    test-cephrbd: "true"
spec:
  replicas: 1
  selector:
    matchLabels:
      test-cephrbd: "true"
  template:
    metadata:
      labels:
        test-cephrbd: "true"
    spec:
      containers:
      - name: web-server-rbd
        image: harbor.foxchan.com/sys/nginx:1.19.4-alpine
        volumeMounts:
        - name: mypvc
          mountPath: /usr/share/nginx/html
      volumes:
      - name: mypvc
        persistentVolumeClaim:
          claimName: rbd-demo-pvc
          readOnly: false
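After applying the demo, the PVC should go to Bound and the nginx pod should have the RBD volume mounted; a minimal verification using the names from the demo above:
kubectl get pvc rbd-demo-pvc
kubectl get pod -l test-cephrbd=true
# confirm the volume is mounted inside the container
kubectl exec deploy/csirbd-demo-pod -- df -h /usr/share/nginx/html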
4. Deploying a filesystem
4.1 Create the CephFS filesystem
The CephFS CSI driver uses quotas to enforce the size declared by a PVC, and only kernels 4.17+ support CephFS quotas.
If your kernel does not support this and you need quota management, set the operator environment variable CSI_FORCE_CEPHFS_KERNEL_CLIENT: false to enable the FUSE client.
With the FUSE client, however, application pods lose their mounts when the Ceph cluster is upgraded and must be restarted before they can use their PVs again.
apiVersion: ceph.rook.io/v1
kind: CephFilesystem
metadata:
  name: myfs
  namespace: rook-ceph
spec:
  # The metadata pool spec. Must use replication.
  metadataPool:
    replicated:
      size: 3
      requireSafeReplicaSize: true
    parameters:
      # Inline compression mode for the data pool
      # Further reference: https://docs.ceph.com/docs/nautilus/rados/configuration/bluestore-config-ref/#inline-compression
      compression_mode: none
      # gives a hint (%) to Ceph in terms of expected consumption of the total cluster capacity of a given pool
      # for more info: https://docs.ceph.com/docs/master/rados/operations/placement-groups/#specifying-expected-pool-size
      #target_size_ratio: ".5"
  # The list of data pool specs. Can use replication or erasure coding.
  dataPools:
    - failureDomain: host
      replicated:
        size: 3
        # Disallow setting pool with replica 1, this could lead to data loss without recovery.
        # Make sure you're *ABSOLUTELY CERTAIN* that is what you want
        requireSafeReplicaSize: true
      parameters:
        # Inline compression mode for the data pool
        # Further reference: https://docs.ceph.com/docs/nautilus/rados/configuration/bluestore-config-ref/#inline-compression
        compression_mode: none
        # gives a hint (%) to Ceph in terms of expected consumption of the total cluster capacity of a given pool
        # for more info: https://docs.ceph.com/docs/master/rados/operations/placement-groups/#specifying-expected-pool-size
        #target_size_ratio: ".5"
  # Whether to preserve filesystem after CephFilesystem CRD deletion
  preserveFilesystemOnDelete: true
  # The metadata service (mds) configuration
  metadataServer:
    # The number of active MDS instances
    activeCount: 1
    # Whether each active MDS instance will have an active standby with a warm metadata cache for faster failover.
    # If false, standbys will be available, but will not have a warm cache.
    activeStandby: true
    # The affinity rules to apply to the mds deployment
    placement:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: app.storage
              operator: In
              values:
              - rook-ceph
    #  topologySpreadConstraints:
    #  tolerations:
    #  - key: mds-node
    #    operator: Exists
    #  podAffinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
            - key: ceph-mds
              operator: In
              values:
              - enabled
          # topologyKey: kubernetes.io/hostname will place MDS across different hosts
          topologyKey: kubernetes.io/hostname
        preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchExpressions:
              - key: ceph-mds
                operator: In
                values:
                - enabled
            # topologyKey: */zone can be used to spread MDS across different AZs
            # Use failure-domain.beta.kubernetes.io/zone if your k8s cluster is v1.16 or lower,
            # topology.kubernetes.io/zone if it is v1.17 or higher
            topologyKey: topology.kubernetes.io/zone
    # A key/value list of annotations
    annotations:
    #  key: value
    # A key/value list of labels
    labels:
    #  key: value
    resources:
    # The requests and limits set here, allow the filesystem MDS Pod(s) to use half of one CPU core and 1 gigabyte of memory
    #  limits:
    #    cpu: "500m"
    #    memory: "1024Mi"
    #  requests:
    #    cpu: "500m"
    #    memory: "1024Mi"
    # priorityClassName: my-priority-class
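After the CephFilesystem is applied, the operator creates the MDS pods and the metadata/data pools. A quick check, assuming Rook's default app=rook-ceph-mds label and the myfs name used above:
kubectl -n rook-ceph get pod -l app=rook-ceph-mds
# inside the toolbox pod
ceph fs ls
ceph mds stat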
4.2 Create the StorageClass
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-cephfs
provisioner: rook-ceph.cephfs.csi.ceph.com
parameters:
  # clusterID is the namespace where operator is deployed.
  clusterID: rook-ceph
  # CephFS filesystem name into which the volume shall be created
  fsName: myfs
  # Ceph pool into which the volume shall be created
  # Required for provisionVolume: "true"
  pool: myfs-data0
  # Root path of an existing CephFS volume
  # Required for provisionVolume: "false"
  # rootPath: /absolute/path
  # The secrets contain Ceph admin credentials. These are generated automatically by the operator
  # in the same namespace as the cluster.
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
  # (optional) The driver can use either ceph-fuse (fuse) or ceph kernel client (kernel)
  # If omitted, default volume mounter will be used - this is determined by probing for ceph-fuse
  # or by setting the default mounter explicitly via --volumemounter command-line argument.
  # Use the kernel client
  mounter: kernel
reclaimPolicy: Delete
allowVolumeExpansion: true
mountOptions:
  # uncomment the following line for debugging
  #- debug
4.3 Create a PVC
When I created CephFS PVCs they stayed in Pending forever. People in the community attribute this to differences between network plugins; with calico I could not get it to work and had to switch the Ceph cluster to host networking, while flannel reportedly works.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cephfs-pvc
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: rook-cephfs
4.4 Demo
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cephfs-demo-pvc
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: rook-cephfs
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: csicephfs-demo-pod
  labels:
    test-cephfs: "true"
spec:
  replicas: 2
  selector:
    matchLabels:
      test-cephfs: "true"
  template:
    metadata:
      labels:
        test-cephfs: "true"
    spec:
      containers:
      - name: web-server
        image: harbor.foxchan.com/sys/nginx:1.19.4-alpine
        imagePullPolicy: Always
        volumeMounts:
        - name: mypvc
          mountPath: /usr/share/nginx/html
      volumes:
      - name: mypvc
        persistentVolumeClaim:
          claimName: cephfs-demo-pvc
          readOnly: false
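Because the CephFS PVC is ReadWriteMany, both nginx replicas should mount the same volume; a quick verification with the names from the demo:
kubectl get pvc cephfs-demo-pvc
kubectl get pod -l test-cephfs=true -o wide
# confirm the cephfs mount inside one of the replicas
kubectl exec deploy/csicephfs-demo-pod -- df -h /usr/share/nginx/html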
5. Problems encountered
5.1 LVM logical volumes cannot be used directly as OSD storage
Upstream issue: https://github.com/rook/rook/issues/5751
Workaround: create a local PV manually, mount the LVM volume, and use that PVC as the OSD device (see the sketch below). If doing this by hand is too much trouble, you can use local-path-provisioner.
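For reference, a minimal sketch of what the OSD-on-PVC approach looks like in the CephCluster spec, based on Rook's cluster-on-pvc example; the set name, count, size, and the local-path StorageClass backed by your LVM volumes are placeholders, so check cluster-on-pvc.yaml in the same examples directory for the full set of options:
  storage:
    storageClassDeviceSets:
    - name: lvm-set
      count: 3               # number of OSDs to create from PVCs
      portable: false        # local volumes cannot move between nodes
      volumeClaimTemplates:
      - metadata:
          name: data
        spec:
          storageClassName: local-path
          volumeMode: Block  # OSDs consume the PV as a raw block device
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 100Gi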
5.2 CephFS PVC stuck in Pending
Upstream issue: https://github.com/rook/rook/issues/6183
Workaround: switch the Kubernetes network plugin, or run the Ceph cluster with host networking.
Original post: https://blog.51cto.com/foxhound/2553979