Developing an etcd Backup Operator

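In the previous articles we implemented a simple etcd operator. Fully operating an etcd cluster also requires backup and restore, so this article walks through writing an Operator that backs up etcd.
First we need to understand how etcd backups work. By default etcd stores its data under the working directory of the command that started it, and the data directory is split into two folders: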
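- snap: stores snapshot data. etcd takes snapshots to keep the number of WAL files from growing without bound; a snapshot records the state of the etcd data.
- wal: stores the write-ahead log, whose main role is to record the complete history of data changes. In etcd, every modification must be written to the WAL before it is committed.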
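The etcdctl commands differ slightly between etcd versions but are largely the same. For a backup we can use snapshot save directly, and because every member of an etcd cluster holds a replica of the data, taking the snapshot from a single node is enough.
# Backup command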
$ ETCDCTL_API=3 etcdctl --endpoints=${ENDPOINTS} snapshot save /data/etcd_backup_dir/etcd-snapshot.db
Restoring overwrites the metadata stored in the snapshot (member ID and cluster ID); it only takes the snapshot restore command pointed at the backed-up data.
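The restore step can be sketched in Go as a helper that only builds the etcdctl invocation without running it. This is a minimal illustration, not part of the operator: the function name restoreCommand and the target directory are assumptions, while snapshot restore and its --data-dir flag are the etcdctl v3 options.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// restoreCommand builds (but does not run) an etcdctl v3 "snapshot restore"
// invocation. The helper name and both paths are hypothetical examples.
func restoreCommand(snapshotFile, dataDir string) *exec.Cmd {
	cmd := exec.Command("etcdctl", "snapshot", "restore", snapshotFile, "--data-dir", dataDir)
	// Select the v3 API explicitly, as in the backup command above.
	cmd.Env = append(os.Environ(), "ETCDCTL_API=3")
	return cmd
}

func main() {
	cmd := restoreCommand("/data/etcd_backup_dir/etcd-snapshot.db", "/var/lib/etcd-restored")
	// Print the argument vector instead of executing it.
	fmt.Println(cmd.Args)
}
```

Calling cmd.Run() would execute the restore; it is left out here so the sketch has no side effects.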
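As shown above, backing up an etcd cluster is simple: a single command that names the node to back up and the directory for the backup data. So if we write an Operator to do this, the CR must provide at least the address of the etcd node to back up and the location where the backup is stored. Backup data is usually kept in object storage such as S3 or OSS; for testing we use minio, which is compatible with the S3 API. To keep the design extensible, the CR also has to tell the controller explicitly which type of storage to use and which path to write to. For example, we can design the CR up front as follows: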
apiVersion: etcd.ydzs.io/v1alpha1
kind: EtcdBackup
metadata:
  name: etcdbackup-sample
spec:
  etcdUrl:  # address of the etcd node to back up
  storageType: s3  # backup storage type
  s3:
    path: "foo-bucket/snapshot.db"  # where the backup data is stored
    s3Secret: "secret"  # Secret containing accessKeyID and secretAccessKey
  oss:
    path: "foo-bucket/snapshot.db"
    ossSecret: "secret"
With the CR designed, all that remains is to create this API resource and implement the corresponding controller.
Adding the API
As before, run the create api command from the project directory:
$ kubebuilder create api --group etcd --version v1alpha1 --kind EtcdBackup
Create Resource [y/n]
y
Create Controller [y/n]
y
Writing scaffold for you to edit...
api/v1alpha1/etcdbackup_types.go
controllers/etcdbackup_controller.go
Running make:
$ make
/Users/ych/devs/projects/go/bin/controller-gen object:headerFile="hack/boilerplate.go.txt" paths="./..."
go fmt ./...
go vet ./...
go build -o bin/manager main.go
Once this completes, the project contains the new EtcdBackup API and its controller. We can overwrite the EtcdBackup object in the samples directory with the CR designed above.
Next, as before, we change the EtcdBackup types to match the CR we designed, starting with the EtcdBackupSpec struct:
// api/v1alpha1/etcdbackup_types.go
type BackupStorageType string
// EtcdBackupSpec defines the desired state of EtcdBackup
type EtcdBackupSpec struct {
// Specific Backup Etcd Endpoints.
EtcdUrl string `json:"etcdUrl"`
// Storage Type:s3 OR oss
StorageType BackupStorageType `json:"storageType"`
// Backup Source
BackupSource `json:",inline"`
}
// BackupSource contains the supported backup sources.
type BackupSource struct {
// S3 defines the S3 backup source spec.
S3 *S3BackupSource `json:"s3,omitempty"`
// OSS defines the OSS backup source spec.
OSS *OSSBackupSource `json:"oss,omitempty"`
}
// S3BackupSource provides the spec how to store backups on S3.
type S3BackupSource struct {
// Path is the full s3 path where the backup is saved.
// The format of the path must be the bucket name and the object path joined by "/",
// e.g: "mybucket/etcd.backup"
Path string `json:"path"`
// The name of the secret object that stores the credential which will be used
// to access S3
//
// The secret must contain the following keys/fields:
// accessKeyID
// accessKeySecret
S3Secret string `json:"s3Secret"`
// Endpoint if blank points to aws. If specified, can point to s3 compatible object
// stores.
Endpoint string `json:"endpoint,omitempty"`
}
// OSSBackupSource provides the spec how to store backups on OSS.
type OSSBackupSource struct {
// Path is the full abs path where the backup is saved.
// The format of the path must be the bucket name and the object path joined by "/",
// e.g: "mybucket/etcd.backup"
Path string `json:"path"`
// The name of the secret object that stores the credential which will be used
// to access Alibaba Cloud OSS.
//
// The secret must contain the following keys/fields:
// accessKeyID
// accessKeySecret
//
// The format of secret:
//
// apiVersion: v1
// kind: Secret
// metadata:
// name:
// type: Opaque
// data:
// accessKeyID:
// accessKeySecret:
//
OSSSecret string `json:"ossSecret"`
// Endpoint is the OSS service endpoint on alibaba cloud, defaults to
// "http://oss-cn-hangzhou.aliyuncs.com".
//
// Details about regions and endpoints, see:
// https://www.alibabacloud.com/help/doc-detail/31837.htm
Endpoint string `json:"endpoint,omitempty"`
}
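Since storageType and the s3/oss sections can disagree, a validation step is a natural companion to these types. The sketch below is an assumption, not part of the generated API: validateSpec and the StorageTypeS3/StorageTypeOSS constants are hypothetical names, and the structs are trimmed copies of the ones above so the example is self-contained.

```go
package main

import (
	"errors"
	"fmt"
)

type BackupStorageType string

// Hypothetical constants for the two values mentioned in the spec comment.
const (
	StorageTypeS3  BackupStorageType = "s3"
	StorageTypeOSS BackupStorageType = "oss"
)

// Trimmed copies of the API types above (JSON tags omitted).
type S3BackupSource struct{ Path, S3Secret, Endpoint string }
type OSSBackupSource struct{ Path, OSSSecret, Endpoint string }

type EtcdBackupSpec struct {
	EtcdUrl     string
	StorageType BackupStorageType
	S3          *S3BackupSource
	OSS         *OSSBackupSource
}

// validateSpec checks that the section matching storageType is present.
func validateSpec(spec EtcdBackupSpec) error {
	if spec.EtcdUrl == "" {
		return errors.New("etcdUrl must be set")
	}
	switch spec.StorageType {
	case StorageTypeS3:
		if spec.S3 == nil || spec.S3.Path == "" {
			return errors.New("storageType is s3 but s3.path is missing")
		}
	case StorageTypeOSS:
		if spec.OSS == nil || spec.OSS.Path == "" {
			return errors.New("storageType is oss but oss.path is missing")
		}
	default:
		return fmt.Errorf("unsupported storageType %q", spec.StorageType)
	}
	return nil
}

func main() {
	spec := EtcdBackupSpec{
		EtcdUrl:     "http://etcd-demo:2379", // example address, not from the article
		StorageType: StorageTypeS3,
		S3:          &S3BackupSource{Path: "foo-bucket/snapshot.db", S3Secret: "secret"},
	}
	fmt.Println(validateSpec(spec))
}
```

A check like this could run at the start of reconciliation, or the same rules could be expressed with kubebuilder validation markers on the types.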
That updates the EtcdBackup spec to match the CR. Because this resource represents a backup task, we also add a status to EtcdBackup: we care about the phase of the backup operation, when the backup started, and when it completed. Modify the EtcdBackupStatus struct as follows:
// api/v1alpha1/etcdbackup_types.go
type EtcdBackupPhase string
const (
EtcdBackupPhaseBackingUp EtcdBackupPhase = "BackingUp"
EtcdBackupPhaseCompleted EtcdBackupPhase = "Completed"
EtcdBackupPhaseFailed EtcdBackupPhase = "Failed"
)
// EtcdBackupStatus defines the observed state of EtcdBackup
type EtcdBackupStatus struct {
// Phase defines the current operation that the backup process is taking.
Phase EtcdBackupPhase `json:"phase,omitempty"`
// StartTime is the time that this backup entered the `BackingUp' phase.
// +optional
StartTime *metav1.Time `json:"startTime,omitempty"`
// CompletionTime is the time that this backup entered the `Completed' phase.
// +optional
CompletionTime *metav1.Time `json:"completionTime,omitempty"`
}
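The Reconcile function shown later only sets Phase, so here is one way the two timestamps could be stamped on phase transitions. It is a sketch under assumptions: withPhase and exampleRun are hypothetical helpers, and *time.Time stands in for *metav1.Time so the example needs only the standard library.

```go
package main

import (
	"fmt"
	"time"
)

type EtcdBackupPhase string

const (
	PhaseBackingUp EtcdBackupPhase = "BackingUp"
	PhaseCompleted EtcdBackupPhase = "Completed"
	PhaseFailed    EtcdBackupPhase = "Failed"
)

// status mirrors EtcdBackupStatus, with *time.Time instead of *metav1.Time.
type status struct {
	Phase          EtcdBackupPhase
	StartTime      *time.Time
	CompletionTime *time.Time
}

// withPhase returns a copy of s moved to phase. StartTime is set the first
// time the backup enters BackingUp, CompletionTime when it enters Completed.
func withPhase(s status, phase EtcdBackupPhase, now time.Time) status {
	s.Phase = phase
	switch phase {
	case PhaseBackingUp:
		if s.StartTime == nil {
			s.StartTime = &now
		}
	case PhaseCompleted:
		s.CompletionTime = &now
	}
	return s
}

// exampleRun walks one backup through BackingUp and Completed, 30 seconds apart.
func exampleRun() status {
	start := time.Date(2020, 1, 1, 12, 0, 0, 0, time.UTC)
	s := withPhase(status{}, PhaseBackingUp, start)
	return withPhase(s, PhaseCompleted, start.Add(30*time.Second))
}

func main() {
	s := exampleRun()
	fmt.Println(s.Phase, s.CompletionTime.Sub(*s.StartTime))
}
```

Passing now as a parameter rather than calling time.Now() inside the helper keeps the transition logic deterministic and easy to test.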
With the API types defined, the next step is the actual controller logic.
Business logic
With the API types defined above, we can now implement the controller's business logic. You may wonder: we already implemented an EtcdCluster controller and are now adding an EtcdBackup controller, so how does this Operator project manage multiple controllers?
Look at the code in main.go: one section registers both controllers with the Manager, as shown below:
// main.go
if err = (&controllers.EtcdClusterReconciler{
Client: mgr.GetClient(),
Log: ctrl.Log.WithName("controllers").WithName("EtcdCluster"),
Scheme: mgr.GetScheme(),
}).SetupWithManager(mgr); err != nil {
setupLog.Error(err, "unable to create controller", "controller", "EtcdCluster")
os.Exit(1)
}
if err = (&controllers.EtcdBackupReconciler{
Client: mgr.GetClient(),
Log: ctrl.Log.WithName("controllers").WithName("EtcdBackup"),
Scheme: mgr.GetScheme(),
}).SetupWithManager(mgr); err != nil {
setupLog.Error(err, "unable to create controller", "controller", "EtcdBackup")
os.Exit(1)
}
// +kubebuilder:scaffold:builder
This shows that a single Manager can manage multiple controllers, so adding more controllers later is not a problem. The parts we need to implement are again only the API structs and the business logic in the Reconcile function.
Next we implement the logic in the Reconcile function.
Implementing reconciliation
The backup controller's types are defined and the controller is registered with the Manager, so what remains is the reconcile logic in the Reconcile function. A backup is essentially a Job-like task, so we only need to compare the desired state with the actual state and decide what to do next. In the end that means starting a Pod to run the backup; the work itself is done by the image that Pod runs, which we implement later. For now any image can stand in for it.
First we define a struct that wraps the EtcdBackup object itself together with its actual and desired state:
// controllers/etcdbackup_controller.go
// backupState holds the actual and desired state of an EtcdBackup
// ("state" here does not mean the status field).
type backupState struct {
	backup  *etcdv1alpha1.EtcdBackup // the EtcdBackup object itself
	actual  *backupStateContainer    // actual state
	desired *backupStateContainer    // desired state
}

// backupStateContainer holds the state of an EtcdBackup.
type backupStateContainer struct {
	pod *corev1.Pod
}
Reconcile then obtains a backupState object and decides the next action from it. The following functions build that state; the desired state naturally includes the Pod we construct to run the task:
// controllers/etcdbackup_controller.go
// setStateActual sets the actual state on backupState.
func (r *EtcdBackupReconciler) setStateActual(ctx context.Context, state *backupState) error {
	var actual backupStateContainer
	key := client.ObjectKey{
		Name:      state.backup.Name,
		Namespace: state.backup.Namespace,
	}
	// fetch the corresponding Pod
	actual.pod = &corev1.Pod{}
	if err := r.Get(ctx, key, actual.pod); err != nil {
		if client.IgnoreNotFound(err) != nil {
			return fmt.Errorf("getting pod error: %s", err)
		}
		actual.pod = nil
	}
	// record the current actual state
	state.actual = &actual
	return nil
}
// setStateDesired sets the desired state on backupState (derived from the EtcdBackup object).
func (r *EtcdBackupReconciler) setStateDesired(state *backupState) error {
	var desired backupStateContainer
	// build the managed Pod that performs the backup
	pod, err := podForBackup(state.backup, r.BackupAgentImage)
	if err != nil {
		return fmt.Errorf("computing pod for backup error: %q", err)
	}
	// set the controller reference
	if err := controllerutil.SetControllerReference(state.backup, pod, r.Scheme); err != nil {
		return fmt.Errorf("setting pod controller reference error : %s", err)
	}
	desired.pod = pod
	// record the desired state
	state.desired = &desired
	return nil
}
// getState collects the full state of the backup so the next action can be decided.
func (r *EtcdBackupReconciler) getState(ctx context.Context, req ctrl.Request) (*backupState, error) {
	var state backupState
	// fetch the EtcdBackup object
	state.backup = &etcdv1alpha1.EtcdBackup{}
	if err := r.Get(ctx, req.NamespacedName, state.backup); err != nil {
		if client.IgnoreNotFound(err) != nil {
			return nil, fmt.Errorf("getting backup error: %s", err)
		}
		// the object was deleted, so ignore it
		state.backup = nil
		return &state, nil
	}
	// get the actual state of the backup
	if err := r.setStateActual(ctx, &state); err != nil {
		return nil, fmt.Errorf("setting actual state error: %s", err)
	}
	// get the desired state of the backup
	if err := r.setStateDesired(&state); err != nil {
		return nil, fmt.Errorf("setting desired state error: %s", err)
	}
	return &state, nil
}
// podForBackup builds the Pod that runs the backup task.
func podForBackup(backup *etcdv1alpha1.EtcdBackup, image string) (*corev1.Pod, error) {
	// construct a brand-new backup Pod
	return &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{
			Name:      backup.Name,
			Namespace: backup.Namespace,
		},
		Spec: corev1.PodSpec{
			Containers: []corev1.Container{
				{
					Name:  "backup-agent",
					Image: image, // TODO: the image that performs the backup
					Resources: corev1.ResourceRequirements{
						Requests: corev1.ResourceList{
							corev1.ResourceCPU:    resource.MustParse("100m"),
							corev1.ResourceMemory: resource.MustParse("50Mi"),
						},
						Limits: corev1.ResourceList{
							corev1.ResourceCPU:    resource.MustParse("100m"),
							corev1.ResourceMemory: resource.MustParse("50Mi"),
						},
					},
				},
			},
			RestartPolicy: corev1.RestartPolicyNever,
		},
	}, nil
}
Once we have the backupState object, the next action depends on the state of the EtcdBackup object or of the Pod that runs the task. Since there are several possible actions, we define an interface that accepts any of them. Create a file action.go in the controllers package with the following content:
// controllers/action.go
package controllers
import (
"context"
"fmt"
"reflect"
"k8s.io/apimachinery/pkg/runtime"
"sigs.k8s.io/controller-runtime/pkg/client"
)
// Action is the interface implemented by every action Reconcile can execute.
type Action interface {
	Execute(context.Context) error
}

// PatchStatus updates the status of an object.
type PatchStatus struct {
	client   client.Client
	original runtime.Object
	new      runtime.Object
}

func (o *PatchStatus) Execute(ctx context.Context) error {
	if reflect.DeepEqual(o.original, o.new) {
		return nil
	}
	// patch the status
	if err := o.client.Status().Patch(ctx, o.new, client.MergeFrom(o.original)); err != nil {
		return fmt.Errorf("while patching status error %q", err)
	}
	return nil
}

// CreateObject creates a new resource object.
type CreateObject struct {
	client client.Client
	obj    runtime.Object
}

func (o *CreateObject) Execute(ctx context.Context) error {
	if err := o.client.Create(ctx, o.obj); err != nil {
		return fmt.Errorf("error %q while creating object ", err)
	}
	return nil
}
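Because callers depend only on the Action interface, a test double can stand in for PatchStatus or CreateObject without a Kubernetes API server. The following is a minimal sketch of that idea: countingAction, errBoom, and runAction are hypothetical names, and runAction only imitates the "execute if not nil" step that Reconcile performs.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Action is the same interface as in controllers/action.go.
type Action interface {
	Execute(context.Context) error
}

// countingAction is a hypothetical test double: it counts calls and returns err.
type countingAction struct {
	calls int
	err   error
}

func (a *countingAction) Execute(context.Context) error {
	a.calls++
	return a.err
}

var errBoom = errors.New("boom")

// runAction skips a nil action and wraps any error from Execute.
func runAction(action Action) error {
	if action == nil {
		return nil
	}
	if err := action.Execute(context.Background()); err != nil {
		return fmt.Errorf("executing action error: %w", err)
	}
	return nil
}

func main() {
	ok := &countingAction{}
	err := runAction(ok)
	fmt.Println(err, ok.calls)
	fmt.Println(runAction(&countingAction{err: errBoom}))
}
```

Wrapping with %w keeps the original error reachable through errors.Is, which the %s and %q verbs do not.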
The main actions Reconcile performs are updating the status of the backup object and creating the backup Pod. Here is the complete Reconcile function:
// controllers/etcdbackup_controller.go
// +kubebuilder:rbac:groups=etcd.ydzs.io,resources=etcdbackups,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=etcd.ydzs.io,resources=etcdbackups/status,verbs=get;update;patch
// +kubebuilder:rbac:groups=core,resources=pods,verbs=get;list;watch;create
func (r *EtcdBackupReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
	ctx := context.Background()
	log := r.Log.WithValues("etcdbackup", req.NamespacedName)
	// get backup state
	state, err := r.getState(ctx, req)
	if err != nil {
		return ctrl.Result{}, err
	}
	// decide the next action from the state
	var action Action
	switch {
	case state.backup == nil: // already deleted
		log.Info("Backup Object not found. Ignoring.")
	case !state.backup.DeletionTimestamp.IsZero(): // marked for deletion
		log.Info("Backup Object has been deleted. Ignoring.")
	case state.backup.Status.Phase == "": // backup is starting, update the status
		log.Info("Backup Starting. Updating status.")
		newBackup := state.backup.DeepCopy()                                            // take a deep copy
		newBackup.Status.Phase = etcdv1alpha1.EtcdBackupPhaseBackingUp                  // set the phase to BackingUp
		action = &PatchStatus{client: r.Client, original: state.backup, new: newBackup} // next action to execute
	case state.backup.Status.Phase == etcdv1alpha1.EtcdBackupPhaseFailed: // backup failed
		log.Info("Backup has failed. Ignoring.")
	case state.backup.Status.Phase == etcdv1alpha1.EtcdBackupPhaseCompleted: // backup completed
		log.Info("Backup has completed. Ignoring.")
	case state.actual.pod == nil: // no backup Pod exists yet
		log.Info("Backup Pod does not exist. Creating.")
		action = &CreateObject{client: r.Client, obj: state.desired.pod} // next action to execute
	case state.actual.pod.Status.Phase == corev1.PodFailed: // backup Pod failed
		log.Info("Backup Pod failed. Updating status.")
		newBackup := state.backup.DeepCopy()
		newBackup.Status.Phase = etcdv1alpha1.EtcdBackupPhaseFailed
		action = &PatchStatus{client: r.Client, original: state.backup, new: newBackup} // next: set the phase to Failed
	case state.actual.pod.Status.Phase == corev1.PodSucceeded: // backup Pod finished
		log.Info("Backup Pod succeeded. Updating status.")
		newBackup := state.backup.DeepCopy()
		newBackup.Status.Phase = etcdv1alpha1.EtcdBackupPhaseCompleted
		action = &PatchStatus{client: r.Client, original: state.backup, new: newBackup} // next: set the phase to Completed
	}
	// execute the action
	if action != nil {
		if err := action.Execute(ctx); err != nil {
			return ctrl.Result{}, fmt.Errorf("executing action error: %s", err)
		}
	}
	return ctrl.Result{}, nil
}
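The order of the cases in the switch matters, and it is easier to check when the decision is isolated from the Kubernetes client. The sketch below restates the same case order as a pure function. It is an illustration only: observed, decision, and decide are hypothetical names, and the phases are plain strings rather than the real API types.

```go
package main

import "fmt"

// decision names what Reconcile would do next.
type decision string

const (
	doNothing     decision = "nothing"
	markBackingUp decision = "patch status: BackingUp"
	createPod     decision = "create pod"
	markFailed    decision = "patch status: Failed"
	markCompleted decision = "patch status: Completed"
)

// observed is a flattened, hypothetical view of backupState.
type observed struct {
	backupExists bool
	deleting     bool
	phase        string // EtcdBackup status phase
	podExists    bool
	podPhase     string // Pod phase: Pending, Running, Succeeded, Failed
}

// decide applies the cases in the same order as the switch in Reconcile.
func decide(o observed) decision {
	switch {
	case !o.backupExists, o.deleting:
		return doNothing
	case o.phase == "":
		return markBackingUp
	case o.phase == "Failed", o.phase == "Completed":
		return doNothing
	case !o.podExists:
		return createPod
	case o.podPhase == "Failed":
		return markFailed
	case o.podPhase == "Succeeded":
		return markCompleted
	}
	return doNothing // the Pod is still Pending or Running
}

func main() {
	steps := []observed{
		{backupExists: true},
		{backupExists: true, phase: "BackingUp"},
		{backupExists: true, phase: "BackingUp", podExists: true, podPhase: "Running"},
		{backupExists: true, phase: "BackingUp", podExists: true, podPhase: "Succeeded"},
	}
	for _, s := range steps {
		fmt.Println(decide(s))
	}
}
```

Each reconcile pass performs at most one step, so a new backup reaches Completed only after several passes: set the phase, create the Pod, then record the Pod's result.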
That largely completes the reconcile loop for the backup CRD. The core piece, the actual backup logic, is still missing; for that we only need to write a separate command-line tool and package it as an image.
One more thing to note: above we used client.Status() to update the EtcdBackup status. Doing so as-is fails with the error could not find the requested resource. To fix this, Status has to be declared as a subresource of EtcdBackup by adding the comment // +kubebuilder:subresource:status above the EtcdBackup struct:
// api/v1alpha1/etcdbackup_types.go
// +kubebuilder:object:root=true
// +kubebuilder:subresource:status
// EtcdBackup is the Schema for the etcdbackups API
type EtcdBackup struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`
Spec EtcdBackupSpec `json:"spec,omitempty"`
Status EtcdBackupStatus `json:"status,omitempty"`
}
After the change, remember to run make install to reinstall the CRD; the Status can then be updated normally.
Implementing the backup operation
Above we run a Pod to perform the backup, so the actual backup is done by the image in that Pod. The backup logic therefore has to be implemented separately and packaged as a standalone image that replaces the placeholder image of the Pod created above.
Create a new file cmd/backup/main.go under the project root to implement the etcd cluster backup, with the following code:
// cmd/backup/main.go
package main
import (
"context"
"flag"
"fmt"
"os"
"path/filepath"
"time"
"github.com/cnych/etcd-operator/pkg/file"
"github.com/go-logr/logr"
"github.com/go-logr/zapr"
"go.etcd.io/etcd/clientv3"
"go.etcd.io/etcd/clientv3/snapshot"
ctrl "sigs.k8s.io/controller-runtime"
"sigs.k8s.io/controller-runtime/pkg/log/zap"
)
func loggedError(log logr.Logger, err error, message string) error {
log.Error(err, message)
return fmt.Errorf("%s: %s", message, err)
}
func main() {
var (
backupTempDir string
etcdURL string
etcdDialTimeoutSeconds int64
timeoutSeconds int64
)
flag.StringVar(&backupTempDir, "backup-tmp-dir", os.TempDir(), "The directory to temporarily place backups before they are uploaded to their destination.")
flag.StringVar(&etcdURL, "etcd-url", "http://localhost:2379", "URL for etcd.")
flag.Int64Var(&etcdDialTimeoutSeconds, "etcd-dial-timeout-seconds", 5, "Timeout, in seconds, for dialing the Etcd API.")
flag.Int64Var(&timeoutSeconds, "timeout-seconds", 60, "Timeout, in seconds, of the whole backup operation.")
flag.Parse()
zapLogger := zap.NewRaw(zap.UseDevMode(true))
ctrl.SetLogger(zapr.NewLogger(zapLogger))
log := ctrl.Log.WithName("backup-agent")
ctx, ctxCancel := context.WithTimeout(context.Background(), time.Second*time.Duration(timeoutSeconds))
defer ctxCancel()
log.Info("Connecting to Etcd and getting snapshot")
localPath := filepath.Join(backupTempDir, "snapshot.db")
etcdClient := snapshot.NewV3(zapLogger.Named("etcd-client"))
err := etcdClient.Save(
ctx,
clientv3.Config{
Endpoints: []string{etcdURL},
DialTimeout: time.Second * time.Duration(etcdDialTimeoutSeconds),
},
localPath,
)
if err != nil {
panic(loggedError(log, err, "failed to get etcd snapshot"))
}
// temporary values for testing
endpoint := "play.min.io"
accessKeyID := "Q3AM3UQ867SPQQA43P2F"
secretAccessKey := "zuf+tfteSlswRu7BJ86wekitnifILbZam1KYY3TG"
s3Uploader := file.NewS3Uploader(endpoint, accessKeyID, secretAccessKey)
log.Info("Uploading snapshot")
size, err := s3Uploader.Upload(ctx, localPath)
if err != nil {
panic(loggedError(log, err, "failed to upload backup"))
}
log.WithValues("upload-size", size).Info("Backup complete")
}
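For now we only want to test the backup function, so we use minio's play.min.io environment directly and temporarily hard-code the endpoint, accessKey, and secretKey. The address of the etcd cluster to back up is passed through the --etcd-url flag, the snapshot is saved to a temporary directory, and the uploader then uploads it. Normally the choice between instantiating S3 or OSS should come from the properties of our EtcdBackup CR; for now we use only S3 and leave that for a later cleanup.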
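One possible shape for that later cleanup is an Uploader interface plus a small factory, with credentials read from the environment instead of being hard-coded. Everything in this sketch is an assumption: the names Uploader, loadCreds, newUploader, and stubUploader, as well as the environment variable names, are hypothetical, and the stub only reports the file size where a real implementation would call the S3 or OSS SDK.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os"
)

// Uploader matches the Upload method that main.go calls on the S3 uploader.
type Uploader interface {
	Upload(ctx context.Context, filePath string) (int64, error)
}

// creds holds the access key pair.
type creds struct{ accessKeyID, secretAccessKey string }

// loadCreds reads credentials through getenv (os.Getenv in production).
// The variable names are hypothetical.
func loadCreds(getenv func(string) string) (creds, error) {
	c := creds{getenv("BACKUP_ACCESS_KEY_ID"), getenv("BACKUP_SECRET_ACCESS_KEY")}
	if c.accessKeyID == "" || c.secretAccessKey == "" {
		return creds{}, errors.New("backup credentials are not set")
	}
	return c, nil
}

// stubUploader stands in for a real S3 or OSS client.
type stubUploader struct {
	kind, endpoint string
	auth           creds
}

// Upload only reports the size of the local file; it uploads nothing.
func (u *stubUploader) Upload(_ context.Context, filePath string) (int64, error) {
	info, err := os.Stat(filePath)
	if err != nil {
		return 0, err
	}
	return info.Size(), nil
}

// newUploader picks an implementation from the CR's storageType value.
func newUploader(storageType, endpoint string, c creds) (Uploader, error) {
	switch storageType {
	case "s3", "oss":
		return &stubUploader{kind: storageType, endpoint: endpoint, auth: c}, nil
	default:
		return nil, fmt.Errorf("unsupported storage type %q", storageType)
	}
}

func main() {
	c, err := loadCreds(os.Getenv)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	u, err := newUploader("s3", "play.min.io", c)
	fmt.Println(u != nil, err)
}
```

In the operator, the environment variables could be populated from the Secret named in the CR through the Pod spec, which keeps the keys out of the image and the source code.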
Create the file pkg/file/s3.go and implement the minio upload in it, using the official minio-go SDK directly:
package file
import (
"context"
"github.com/minio/minio-go/v7"
"github.com/minio/minio-go/v7/pkg/credentials"
)
type s3Uploader struct {
Endpoint string
AccessKeyID string
SecretAccessKey string
}
func NewS3Uploader(Endpoint, AK, SK string) *s3Uploader {
return &s3Uploader{
Endpoint: Endpoint,
AccessKeyID: AK,
SecretAccessKey: SK,
}
}
// InitClient initializes the minio client object.
func (su *s3Uploader) InitClient() (*minio.Client, error) {
return minio.New(su.Endpoint, &minio.Options{
Creds: credentials.NewStaticV4(su.AccessKeyID, su.SecretAccessKey, ""),
Secure: true,
})
}
func (su *s3Uploader) Upload(ctx context.Context, filePath string) (int64, error) {
client, err := su.InitClient()
if err != nil {
return 0, err
}
bucketName := "testback" // todo
objectName := "etcd-snapshot.db" // todo
uploadInfo, err := client.FPutObject(ctx, bucketName, objectName, filePath, minio.PutObjectOptions{})
if err != nil {
return 0, err
}
return uploadInfo.Size, nil
}
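The two todo values above could eventually be derived from the path field of the CR, which combines bucket and object name as in "foo-bucket/snapshot.db". A minimal sketch of that split follows; splitBackupPath is a hypothetical helper, not part of the project.

```go
package main

import (
	"fmt"
	"strings"
)

// splitBackupPath splits "bucket/object/key" into the bucket name and the
// object name. Everything after the first "/" belongs to the object name.
func splitBackupPath(path string) (bucket, object string, err error) {
	parts := strings.SplitN(path, "/", 2)
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return "", "", fmt.Errorf("invalid backup path %q, want bucket/object", path)
	}
	return parts[0], parts[1], nil
}

func main() {
	bucket, object, err := splitBackupPath("foo-bucket/snapshot.db")
	fmt.Println(bucket, object, err)
}
```

The two results map directly onto the bucketName and objectName arguments of FPutObject.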
Then package the backup task above into a new Docker image by modifying the Dockerfile in the project root as follows:
# Build the manager binary
FROM golang:1.13 as builder
RUN apt-get -y update && apt-get -y install upx
WORKDIR /workspace
# Copy the Go Modules manifests
COPY go.mod go.mod
COPY go.sum go.sum
# cache deps before building and copying source so that we don't need to re-download as much
# and so that source changes don't invalidate our downloaded layer
RUN export GOPROXY="https://goproxy.cn" && go mod download
# Copy the go source
COPY main.go main.go
COPY api/ api/
COPY controllers/ controllers/
COPY cmd/ cmd/
COPY pkg/ pkg/
ENV CGO_ENABLED=0
ENV GOOS=linux
ENV GOARCH=amd64
ENV GO111MODULE=on
# Build
RUN go build -mod=readonly -o manager main.go
RUN go build -mod=readonly -o backup cmd/backup/main.go
RUN upx manager backup
# Use distroless as minimal base image to package the manager binary
# Refer to https://github.com/GoogleContainerTools/distroless for more details
FROM gcr.io/distroless/static:nonroot
WORKDIR /
COPY --from=builder /workspace/manager .
USER nonroot:nonroot
ENTRYPOINT ["/manager"]
FROM gcr.io/distroless/static:nonroot as backup
WORKDIR /
COPY --from=builder /workspace/backup .
USER nonroot:nonroot
ENTRYPOINT ["/backup"]
Here we use Docker multi-stage builds to build the Operator image and the backup image separately. Run the following commands to build the backup image:
$ docker build --target backup -t cnych/etcd-operator-backup:v0.0.4 -f Dockerfile .
$ docker push cnych/etcd-operator-backup:v0.0.4
Then remember to update the Pod spec generated by the EtcdBackup controller:
// controllers/etcdbackup_controller.go
func podForBackup(backup *etcdv1alpha1.EtcdBackup, image string) (*corev1.Pod, error) {
return &corev1.Pod{
ObjectMeta: metav1.ObjectMeta{
Name: backup.Name,
Namespace: backup.Namespace,
},
Spec: corev1.PodSpec{
Containers: []corev1.Container{
{
Name: "etcd-backup",
Image: image, // todo
Args: []string{
"--etcd-url", backup.Spec.EtcdUrl,
},
......
In main.go in the project root, change the default backup image address:
// main.go
var (
scheme = runtime.NewScheme()
setupLog = ctrl.Log.WithName("setup")
defaultBackupImage = "cnych/etcd-operator-backup:v0.0.9" // default backup image address
)
......
Testing the backup
With the feature implemented, we can test the backup. Run the following commands to start the controller:
$ make install
$ make run
Once it is running, create the sample backup CR:
$ kubectl apply -f config/samples/etcd_v1alpha1_etcdbackup.yaml
$ kubectl get etcdbackup
NAME                AGE
etcdbackup-sample   18h
$ kubectl get pods
NAME                READY   STATUS      RESTARTS   AGE
etcd-demo-0         1/1     Running     0          6d21h
etcd-demo-1         1/1     Running     0          6d21h
etcd-demo-2         1/1     Running     0          6d21h
etcdbackup-sample   0/1     Completed   0          18h
The logs of the backup Pod also show that the backup succeeded.
The basic flow now works end to end; what remains is to clean up the code and add the other backup types.
