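Developing a simple etcd operator
Earlier we looked at how etcd clusters are bootstrapped and how to deploy an etcd cluster on Kubernetes. Writing an Operator for it simply means implementing that series of etcd operational tasks in code; put bluntly, it means translating the startup script in the StatefulSet into Go. We will develop it incrementally in several versions. The first version is the simplest possible Operator: it just generates the StatefulSet template from earlier.
Project initialization
As before, we should settle on the CRD resource before writing the Operator. For example, suppose we want to create an etcd cluster from the following CR: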
apiVersion: etcd.ydzs.io/v1alpha1
kind: EtcdCluster
metadata:
  name: demo
spec:
  size: 3  # number of replicas
  image: cnych/etcd:v3.4.13  # container image
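All other information is obtained by the script, so the size and image fields are basically enough to determine what an etcd cluster deployment looks like. The first version is therefore very simple: as long as the startup script is correct, the Operator only needs to assemble a StatefulSet and a Headless SVC object from the EtcdCluster CR defined above.
First initialize the project; here we use kubebuilder to build the scaffolding: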
$ kubebuilder init --domain ydzs.io --owner cnych --repo github.com/cnych/etcd-operator
Writing scaffold for you to edit...
Get controller runtime:
$ go get sigs.k8s.io/[email protected]
Update go.mod:
$ go mod tidy
Running make:
$ make
/Users/ych/devs/projects/go/bin/controller-gen object:headerFile="hack/boilerplate.go.txt" paths="./..."
go fmt ./...
go vet ./...
go build -o bin/manager main.go
Next: define a resource with:
$ kubebuilder create api
Once the scaffolding has been created, define the resource API:
$ kubebuilder create api --group etcd --version v1alpha1 --kind EtcdCluster
Create Resource [y/n]
y
Create Controller [y/n]
y
Writing scaffold for you to edit...
api/v1alpha1/etcdcluster_types.go
controllers/etcdcluster_controller.go
Running make:
$ make
/Users/ych/devs/projects/go/bin/controller-gen object:headerFile="hack/boilerplate.go.txt" paths="./..."
go fmt ./...
go vet ./...
go build -o bin/manager main.go
The project is now initialized; the overall code layout is:
$ tree -L 2
.
├── Dockerfile
├── Makefile
├── PROJECT
├── api
│   └── v1alpha1
├── bin
│   └── manager
├── config
│   ├── certmanager
│   ├── crd
│   ├── default
│   ├── manager
│   ├── prometheus
│   ├── rbac
│   ├── samples
│   └── webhook
├── controllers
│   ├── etcdcluster_controller.go
│   └── suite_test.go
├── go.mod
├── go.sum
├── hack
│   └── boilerplate.go.txt
└── main.go
14 directories, 10 files
Next, edit the Operator's struct to match the EtcdCluster object designed above by modifying the EtcdClusterSpec struct in api/v1alpha1/etcdcluster_types.go:
// api/v1alpha1/etcdcluster_types.go
// EtcdClusterSpec defines the desired state of EtcdCluster
type EtcdClusterSpec struct {
	// INSERT ADDITIONAL SPEC FIELDS - desired state of cluster
	// Important: Run "make" to regenerate code after modifying this file
	// Size is a *int32 so that it can be assigned directly to
	// StatefulSetSpec.Replicas and dereferenced in resource.go.
	Size  *int32 `json:"size"`
	Image string `json:"image"`
}
Note that after every change you need to run make to regenerate the code:
$ make
/Users/ych/devs/projects/go/bin/controller-gen object:headerFile="hack/boilerplate.go.txt" paths="./..."
go fmt ./...
go vet ./...
go build -o bin/manager main.go
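As a quick sanity check of the json tags on the spec, the following standalone sketch decodes a spec payload shaped like the CR above. It assumes Size is declared as *int32 (the type the later `*cluster.Spec.Size` dereference implies); decodeSpec is a hypothetical helper for illustration and is not part of the generated project.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// EtcdClusterSpec mirrors the two fields of the CRD spec.
// Assumption: Size is a *int32, matching StatefulSetSpec.Replicas.
type EtcdClusterSpec struct {
	Size  *int32 `json:"size"`
	Image string `json:"image"`
}

// decodeSpec is a hypothetical helper that parses the JSON form of the
// "spec" section of an EtcdCluster resource.
func decodeSpec(raw []byte) (EtcdClusterSpec, error) {
	var spec EtcdClusterSpec
	err := json.Unmarshal(raw, &spec)
	return spec, err
}

func main() {
	spec, err := decodeSpec([]byte(`{"size": 3, "image": "cnych/etcd:v3.4.13"}`))
	if err != nil {
		panic(err)
	}
	// Size is a pointer, so it has to be dereferenced before use.
	fmt.Println(*spec.Size, spec.Image)
}
```

A pointer field also lets the code tell "size not set" (nil) apart from an explicit value.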
Now we can implement our own business logic in the controller's Reconcile function.
Business logic
First create a resource.go file in the controllers directory; it generates the StatefulSet and Headless SVC objects from our EtcdCluster object.
// controllers/resource.go
package controllers
import (
	"strconv"
	"github.com/cnych/etcd-operator/api/v1alpha1"
	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)
var (
	EtcdClusterLabelKey       = "etcd.ydzs.io/cluster"
	EtcdClusterCommonLabelKey = "app"
	EtcdDataDirName           = "datadir"
)
// MutateStatefulSet fills sts with the desired state derived from cluster.
func MutateStatefulSet(cluster *v1alpha1.EtcdCluster, sts *appsv1.StatefulSet) {
	sts.Labels = map[string]string{
		EtcdClusterCommonLabelKey: "etcd",
	}
	sts.Spec = appsv1.StatefulSetSpec{
		Replicas:    cluster.Spec.Size,
		ServiceName: cluster.Name,
		Selector: &metav1.LabelSelector{MatchLabels: map[string]string{
			EtcdClusterLabelKey: cluster.Name,
		}},
		Template: corev1.PodTemplateSpec{
			ObjectMeta: metav1.ObjectMeta{
				Labels: map[string]string{
					EtcdClusterLabelKey:       cluster.Name,
					EtcdClusterCommonLabelKey: "etcd",
				},
			},
			Spec: corev1.PodSpec{
				Containers: newContainers(cluster),
			},
		},
		VolumeClaimTemplates: []corev1.PersistentVolumeClaim{
			corev1.PersistentVolumeClaim{
				ObjectMeta: metav1.ObjectMeta{
					Name: EtcdDataDirName,
				},
				Spec: corev1.PersistentVolumeClaimSpec{
					AccessModes: []corev1.PersistentVolumeAccessMode{
						corev1.ReadWriteOnce,
					},
					Resources: corev1.ResourceRequirements{
						Requests: corev1.ResourceList{
							corev1.ResourceStorage: resource.MustParse("1Gi"),
						},
					},
				},
			},
		},
	}
}
// newContainers builds the etcd container, including the startup script.
func newContainers(cluster *v1alpha1.EtcdCluster) []corev1.Container {
	return []corev1.Container{
		corev1.Container{
			Name:  "etcd",
			Image: cluster.Spec.Image,
			Ports: []corev1.ContainerPort{
				corev1.ContainerPort{
					Name:          "peer",
					ContainerPort: 2380,
				},
				corev1.ContainerPort{
					Name:          "client",
					ContainerPort: 2379,
				},
			},
			Env: []corev1.EnvVar{
				corev1.EnvVar{
					Name:  "INITIAL_CLUSTER_SIZE",
					Value: strconv.Itoa(int(*cluster.Spec.Size)),
				},
				corev1.EnvVar{
					Name:  "SET_NAME",
					Value: cluster.Name,
				},
				corev1.EnvVar{
					Name: "POD_IP",
					ValueFrom: &corev1.EnvVarSource{
						FieldRef: &corev1.ObjectFieldSelector{
							FieldPath: "status.podIP",
						},
					},
				},
				corev1.EnvVar{
					Name: "MY_NAMESPACE",
					ValueFrom: &corev1.EnvVarSource{
						FieldRef: &corev1.ObjectFieldSelector{
							FieldPath: "metadata.namespace",
						},
					},
				},
			},
			VolumeMounts: []corev1.VolumeMount{
				corev1.VolumeMount{
					Name:      EtcdDataDirName,
					MountPath: "/var/run/etcd",
				},
			},
			Command: []string{
				"/bin/sh", "-ec",
				"HOSTNAME=$(hostname)\n\n              ETCDCTL_API=3\n\n              eps() {\n                  EPS=\"\"\n                  for i in $(seq 0 $((${INITIAL_CLUSTER_SIZE} - 1))); do\n                      EPS=\"${EPS}${EPS:+,}http://${SET_NAME}-${i}.${SET_NAME}.${MY_NAMESPACE}.svc.cluster.local:2379\"\n                  done\n                  echo ${EPS}\n              }\n\n              member_hash() {\n                  etcdctl member list | grep -w \"$HOSTNAME\" | awk '{ print $1}' | awk -F \",\" '{ print $1}'\n              }\n\n              initial_peers() {\n                  PEERS=\"\"\n                  for i in $(seq 0 $((${INITIAL_CLUSTER_SIZE} - 1))); do\n                    PEERS=\"${PEERS}${PEERS:+,}${SET_NAME}-${i}=http://${SET_NAME}-${i}.${SET_NAME}.${MY_NAMESPACE}.svc.cluster.local:2380\"\n                  done\n                  echo ${PEERS}\n              }\n\n              # etcd-SET_ID\n              SET_ID=${HOSTNAME##*-}\n\n              # adding a new member to existing cluster (assuming all initial pods are available)\n              if [ \"${SET_ID}\" -ge ${INITIAL_CLUSTER_SIZE} ]; then\n                  # export ETCDCTL_ENDPOINTS=$(eps)\n                  # member already added?\n\n                  MEMBER_HASH=$(member_hash)\n                  if [ -n \"${MEMBER_HASH}\" ]; then\n                      # the member hash exists but for some reason etcd failed\n                      # as the datadir has not be created, we can remove the member\n                      # and retrieve new hash\n                      echo \"Remove member ${MEMBER_HASH}\"\n                      etcdctl --endpoints=$(eps) member remove ${MEMBER_HASH}\n                  fi\n\n                  echo \"Adding new member\"\n\n                  etcdctl member --endpoints=$(eps) add ${HOSTNAME} --peer-urls=http://${HOSTNAME}.${SET_NAME}.${MY_NAMESPACE}.svc.cluster.local:2380 | grep \"^ETCD_\" > /var/run/etcd/new_member_envs\n\n                  if [ $? -ne 0 ]; then\n                      echo \"member add ${HOSTNAME} error.\"\n                      rm -f /var/run/etcd/new_member_envs\n                      exit 1\n                  fi\n\n                  echo \"==> Loading env vars of existing cluster...\"\n                  sed -ie \"s/^/export /\" /var/run/etcd/new_member_envs\n                  cat /var/run/etcd/new_member_envs\n                  . /var/run/etcd/new_member_envs\n\n                  exec etcd --listen-peer-urls http://${POD_IP}:2380 \\\n                      --listen-client-urls http://${POD_IP}:2379,http://127.0.0.1:2379 \\\n                      --advertise-client-urls http://${HOSTNAME}.${SET_NAME}.${MY_NAMESPACE}.svc.cluster.local:2379 \\\n                      --data-dir /var/run/etcd/default.etcd\n              fi\n\n              for i in $(seq 0 $((${INITIAL_CLUSTER_SIZE} - 1))); do\n                  while true; do\n                      echo \"Waiting for ${SET_NAME}-${i}.${SET_NAME}.${MY_NAMESPACE}.svc.cluster.local to come up\"\n                      ping -W 1 -c 1 ${SET_NAME}-${i}.${SET_NAME}.${MY_NAMESPACE}.svc.cluster.local > /dev/null && break\n                      sleep 1s\n                  done\n              done\n\n              echo \"join member ${HOSTNAME}\"\n              # join member\n              exec etcd --name ${HOSTNAME} \\\n                  --initial-advertise-peer-urls http://${HOSTNAME}.${SET_NAME}.${MY_NAMESPACE}.svc.cluster.local:2380 \\\n                  --listen-peer-urls http://${POD_IP}:2380 \\\n                  --listen-client-urls http://${POD_IP}:2379,http://127.0.0.1:2379 \\\n                  --advertise-client-urls http://${HOSTNAME}.${SET_NAME}.${MY_NAMESPACE}.svc.cluster.local:2379 \\\n                  --initial-cluster-token etcd-cluster-1 \\\n                  --data-dir /var/run/etcd/default.etcd \\\n                  --initial-cluster $(initial_peers) \\\n                  --initial-cluster-state new",
			},
			Lifecycle: &corev1.Lifecycle{
				PreStop: &corev1.Handler{
					Exec: &corev1.ExecAction{
						Command: []string{
							"/bin/sh", "-ec",
							"HOSTNAME=$(hostname)\n\n                    member_hash() {\n                        etcdctl member list | grep -w \"$HOSTNAME\" | awk '{ print $1}' | awk -F \",\" '{ print $1}'\n                    }\n\n                    eps() {\n                        EPS=\"\"\n                        for i in $(seq 0 $((${INITIAL_CLUSTER_SIZE} - 1))); do\n                            EPS=\"${EPS}${EPS:+,}http://${SET_NAME}-${i}.${SET_NAME}.${MY_NAMESPACE}.svc.cluster.local:2379\"\n                        done\n                        echo ${EPS}\n                    }\n\n                    export ETCDCTL_ENDPOINTS=$(eps)\n                    SET_ID=${HOSTNAME##*-}\n\n                    # Removing member from cluster\n                    if [ \"${SET_ID}\" -ge ${INITIAL_CLUSTER_SIZE} ]; then\n                        echo \"Removing ${HOSTNAME} from etcd cluster\"\n                        etcdctl member remove $(member_hash)\n                        if [ $? -eq 0 ]; then\n                            # Remove everything otherwise the cluster will no longer scale-up\n                            rm -rf /var/run/etcd/*\n                        fi\n                    fi",
						},
					},
				},
			},
		},
	}
}
// MutateHeadlessSvc fills svc with the headless Service for cluster.
func MutateHeadlessSvc(cluster *v1alpha1.EtcdCluster, svc *corev1.Service) {
	svc.Labels = map[string]string{
		EtcdClusterCommonLabelKey: "etcd",
	}
	svc.Spec = corev1.ServiceSpec{
		ClusterIP: corev1.ClusterIPNone,
		Selector: map[string]string{
			EtcdClusterLabelKey: cluster.Name,
		},
		Ports: []corev1.ServicePort{
			corev1.ServicePort{
				Name: "peer",
				Port: 2380,
			},
			corev1.ServicePort{
				Name: "client",
				Port: 2379,
			},
		},
	}
}
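That is a lot of code, but the logic is simple: it builds the StatefulSet and Headless SVC objects from our EtcdCluster. With that in place, when an EtcdCluster is created we can handle it in the controller's Reconcile function. Here we can simply reuse the code from the earlier example: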
// controllers/etcdcluster_controller.go
// Besides the scaffolded imports, this needs appsv1 "k8s.io/api/apps/v1",
// corev1 "k8s.io/api/core/v1" and
// "sigs.k8s.io/controller-runtime/pkg/controller/controllerutil".
func (r *EtcdClusterReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
	ctx := context.Background()
	log := r.Log.WithValues("etcdcluster", req.NamespacedName)
	// First fetch the EtcdCluster instance
	var etcdCluster etcdv1alpha1.EtcdCluster
	if err := r.Client.Get(ctx, req.NamespacedName, &etcdCluster); err != nil {
		// The EtcdCluster was deleted, ignore
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}
	// Once we have the EtcdCluster, create the corresponding StatefulSet and Service
	// CreateOrUpdate
	// (i.e. compare the observed current state with the desired state)
	// Reconcile: fetch the current state and compare it with the desired state
	// CreateOrUpdate Service
	var svc corev1.Service
	svc.Name = etcdCluster.Name
	svc.Namespace = etcdCluster.Namespace
	or, err := ctrl.CreateOrUpdate(ctx, r, &svc, func() error {
		// The mutation towards the desired state must happen inside this function
		MutateHeadlessSvc(&etcdCluster, &svc)
		return controllerutil.SetControllerReference(&etcdCluster, &svc, r.Scheme)
	})
	if err != nil {
		return ctrl.Result{}, err
	}
	log.Info("CreateOrUpdate", "Service", or)
	// CreateOrUpdate StatefulSet
	var sts appsv1.StatefulSet
	sts.Name = etcdCluster.Name
	sts.Namespace = etcdCluster.Namespace
	or, err = ctrl.CreateOrUpdate(ctx, r, &sts, func() error {
		// The mutation towards the desired state must happen inside this function
		MutateStatefulSet(&etcdCluster, &sts)
		return controllerutil.SetControllerReference(&etcdCluster, &sts, r.Scheme)
	})
	if err != nil {
		return ctrl.Result{}, err
	}
	log.Info("CreateOrUpdate", "StatefulSet", or)
	return ctrl.Result{}, nil
}
Here we reconcile the EtcdCluster object and create or update the corresponding StatefulSet and Headless SVC objects. The logic is simple, and with it we have the first version of our etcd-operator.
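To make the create-or-update idea concrete, here is a toy, in-memory sketch of the pattern: look the object up, run a mutate function that sets the desired state, then create it, update it, or leave it alone. This is not controller-runtime's implementation; the object and store types and the createOrUpdate method are hypothetical stand-ins for a Kubernetes resource and the API server.

```go
package main

import (
	"fmt"
	"reflect"
)

// object is a hypothetical stand-in for a Kubernetes resource.
type object struct {
	Name   string
	Labels map[string]string
}

// clone returns a deep copy so stored objects are never aliased.
func (o object) clone() object {
	c := object{Name: o.Name, Labels: make(map[string]string, len(o.Labels))}
	for k, v := range o.Labels {
		c.Labels[k] = v
	}
	return c
}

// store is a toy in-memory "API server" keyed by object name.
type store map[string]object

// createOrUpdate loads the current state of obj (if any), lets mutate set
// the desired state, and reports what it had to do to converge.
func (s store) createOrUpdate(obj *object, mutate func()) string {
	existing, found := s[obj.Name]
	if found {
		*obj = existing.clone() // start from the observed state
	}
	mutate() // apply the desired state
	if !found {
		s[obj.Name] = obj.clone()
		return "created"
	}
	if reflect.DeepEqual(existing, *obj) {
		return "unchanged"
	}
	s[obj.Name] = obj.clone()
	return "updated"
}

func main() {
	s := store{}
	desired := "etcd"
	reconcile := func() string {
		obj := object{Name: "etcd-sample"}
		return s.createOrUpdate(&obj, func() {
			obj.Labels = map[string]string{"app": desired}
		})
	}
	fmt.Println(reconcile()) // created
	fmt.Println(reconcile()) // unchanged
	desired = "etcd-v2"
	fmt.Println(reconcile()) // updated
}
```

Running the same reconcile twice without changing the desired state is a no-op, which is the property that makes repeated Reconcile calls safe.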
Debugging
Next, install the CRD so that Kubernetes recognizes our EtcdCluster object:
$ make install
/Users/ych/devs/projects/go/bin/controller-gen "crd:trivialVersions=true" rbac:roleName=manager-role webhook paths="./..." output:crd:artifacts:config=config/crd/bases
kustomize build config/crd | kubectl apply -f -
customresourcedefinition.apiextensions.k8s.io/etcdclusters.etcd.ydzs.io configured
Then run the controller:
$ make run
/Users/ych/devs/projects/go/bin/controller-gen object:headerFile="hack/boilerplate.go.txt" paths="./..."
go fmt ./...
go vet ./...
/Users/ych/devs/projects/go/bin/controller-gen "crd:trivialVersions=true" rbac:roleName=manager-role webhook paths="./..." output:crd:artifacts:config=config/crd/bases
go run ./main.go
2020-11-20T17:44:48.222+0800    INFO    controller-runtime.metrics      metrics server is starting to listen    {"addr": ":8080"}
2020-11-20T17:44:48.223+0800    INFO    setup   starting manager
2020-11-20T17:44:48.223+0800    INFO    controller-runtime.manager      starting metrics server {"path": "/metrics"}
2020-11-20T17:44:48.223+0800    INFO    controller-runtime.controller   Starting EventSource    {"controller": "etcdcluster", "source": "kind source: /, Kind="}
2020-11-20T17:44:48.326+0800    INFO    controller-runtime.controller   Starting Controller     {"controller": "etcdcluster"}
2020-11-20T17:44:48.326+0800    INFO    controller-runtime.controller   Starting workers        {"controller": "etcdcluster", "worker count": 1}
Once the controller has started we can create our etcd cluster. Change the sample CR manifest to the following YAML:
apiVersion: etcd.ydzs.io/v1alpha1
kind: EtcdCluster
metadata:
  name: etcd-sample
spec:
  size: 3
  image: cnych/etcd:v3.4.13
Open another terminal and create the resource:
$ kubectl apply -f config/samples/etcd_v1alpha1_etcdcluster.yaml
etcdcluster.etcd.ydzs.io/etcd-sample created
After it is created we can view the EtcdCluster object:
$ kubectl get etcdcluster
NAME          AGE
etcd-sample   2m35s
The StatefulSet and Service are created automatically as well:
$ kubectl get all -l app=etcd
NAME                READY   STATUS    RESTARTS   AGE
pod/etcd-sample-0   1/1     Running   0          85s
pod/etcd-sample-1   1/1     Running   0          71s
pod/etcd-sample-2   1/1     Running   0          66s
NAME                  TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)             AGE
service/etcd-sample   ClusterIP   None         <none>        2380/TCP,2379/TCP   86s
NAME                           READY   AGE
statefulset.apps/etcd-sample   3/3     87s
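At this point our etcd cluster is up and running: with only a small amount of code we have implemented an etcd-operator.
Of course, many details are still unhandled. For example, we have not yet added the RBAC permission declarations for StatefulSet and Headless SVC, nor a Watch on changes to these two resources. We covered both earlier, so you can try completing this part yourself. Note also that the way we implemented the etcd operator here is something of a shortcut: we have to write the startup script in advance. That is not a conventional approach, but now that we know how to start an etcd cluster, we can implement the same steps in Go later, so this is just one step along the way.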
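As a small taste of what translating the startup script into Go might look like, the sketch below reimplements two pieces of it: the initial_peers() shell function and the SET_ID=${HOSTNAME##*-} ordinal extraction. The function names initialPeers and ordinal are hypothetical, chosen here for illustration; this is a sketch under those assumptions, not the code a later version of the operator necessarily uses.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// initialPeers builds the value passed to --initial-cluster, following the
// same naming scheme as the initial_peers() shell function:
// <set>-<i>=http://<set>-<i>.<set>.<namespace>.svc.cluster.local:2380
func initialPeers(setName, namespace string, size int) string {
	peers := make([]string, 0, size)
	for i := 0; i < size; i++ {
		host := fmt.Sprintf("%s-%d", setName, i)
		peers = append(peers, fmt.Sprintf(
			"%s=http://%s.%s.%s.svc.cluster.local:2380",
			host, host, setName, namespace))
	}
	return strings.Join(peers, ",")
}

// ordinal extracts the StatefulSet ordinal from a pod hostname, like
// SET_ID=${HOSTNAME##*-} in the script. Unlike the shell expansion, it
// returns an error when the hostname has no numeric suffix.
func ordinal(hostname string) (int, error) {
	idx := strings.LastIndex(hostname, "-")
	if idx < 0 {
		return 0, fmt.Errorf("hostname %q has no ordinal suffix", hostname)
	}
	return strconv.Atoi(hostname[idx+1:])
}

func main() {
	fmt.Println(initialPeers("etcd-sample", "default", 3))

	id, err := ordinal("etcd-sample-4")
	if err != nil {
		panic(err)
	}
	// The script treats pods with ordinal >= INITIAL_CLUSTER_SIZE as
	// members that must join an existing cluster.
	fmt.Println("joins existing cluster:", id >= 3)
}
```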
This article is an excerpt from the course documentation of the Kubernetes Development Course, which is still being updated.
