Original article by SY小站 (SY技术小站)






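1. Introduction

Most people back up and restore a k8s cluster through etcd. But sometimes a namespace gets deleted by accident. If that namespace held hundreds of services and they all vanished at once, what then?

You could redeploy everything through your CI/CD system, but that takes a long time. This is where VMware's Velero comes in.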

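Velero helps in two scenarios:

  • Disaster recovery: it can back up and restore a k8s cluster
  • Migration: it can copy cluster resources to another cluster (replicating and syncing the configuration of development, test, and production clusters, which simplifies environment setup)

The rest of this post shows how to use Velero for backup and migration.

Velero: https://github.com/vmware-tanzu/velero
ACK plugin: https://github.com/AliyunContainerService/velero-plugin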

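2. Download the Velero client

Velero consists of a client and a server. The server is deployed on the target k8s cluster, while the client is a command-line tool that runs locally.

  • Download the client from Velero's Releases page on GitHub
  • Unpack the release archive
  • Move the velero binary from the release archive into a directory on your $PATH
  • Run velero -h to verify the installation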
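
The steps above can be sketched as a small script. This is only an illustration: the default version (v1.4.0) and platform (linux-amd64) match the directory name seen later in this post, the URL follows the pattern GitHub uses for Velero release assets, and /usr/local/bin is an assumed install directory. The script prints the commands instead of running them, so nothing is downloaded or changed.

```shell
#!/usr/bin/env bash
# Sketch: build the download URL of a Velero client release and print
# the install steps. Version, platform and install directory are assumptions.

velero_release_url() {
  local version="${1:-v1.4.0}"
  local platform="${2:-linux-amd64}"
  echo "https://github.com/vmware-tanzu/velero/releases/download/${version}/velero-${version}-${platform}.tar.gz"
}

# Print the steps rather than execute them, so the script has no side effects.
print_install_steps() {
  local version="${1:-v1.4.0}"
  local platform="${2:-linux-amd64}"
  local url
  url="$(velero_release_url "$version" "$platform")"
  echo "curl -fsSLO ${url}"
  echo "tar -xzf velero-${version}-${platform}.tar.gz"
  echo "sudo mv velero-${version}-${platform}/velero /usr/local/bin/"
  echo "velero -h"
}

print_install_steps "$@"
```

Review the printed commands, then run them by hand (or pipe them to a shell) once the version and paths look right for your machine.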

3. Deploy the velero-plugin

Clone the repository

git clone https://github.com/AliyunContainerService/velero-plugin

Edit the configuration

# Edit `install/credentials-velero`: fill in the `AccessKeyID` and `AccessKeySecret` of the newly created user. The OSS endpoint is the access domain of your OSS bucket.
ALIBABA_CLOUD_ACCESS_KEY_ID=
ALIBABA_CLOUD_ACCESS_KEY_SECRET=
ALIBABA_CLOUD_OSS_ENDPOINT=

# Edit `install/01-velero.yaml` and fill in the OSS settings:
---
apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: velero
  name: velero
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    component: velero
  name: velero
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: velero
  namespace: velero
---
apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  labels:
    component: velero
  name: default
  namespace: velero
spec:
  config:
    region: cn-beijing
  objectStorage:
    bucket: k8s-backup-test
    prefix: test
  provider: alibabacloud
---
apiVersion: velero.io/v1
kind: VolumeSnapshotLocation
metadata:
  labels:
    component: velero
  name: default
  namespace: velero
spec:
  config:
    region: cn-beijing
  provider: alibabacloud
---
# Note: extensions/v1beta1 Deployments were removed in Kubernetes 1.16; on newer clusters use apps/v1.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: velero
  namespace: velero
spec:
  replicas: 1
  selector:
    matchLabels:
      deploy: velero
  template:
    metadata:
      annotations:
        prometheus.io/path: /metrics
        prometheus.io/port: "8085"
        prometheus.io/scrape: "true"
      labels:
        component: velero
        deploy: velero
    spec:
      serviceAccountName: velero
      containers:
      - name: velero
        # sync from velero/velero:v1.2.0
        image: registry.cn-hangzhou.aliyuncs.com/acs/velero:v1.2.0
        imagePullPolicy: IfNotPresent
        command:
          - /velero
        args:
          - server
          - --default-volume-snapshot-locations=alibabacloud:default
        env:
          - name: VELERO_SCRATCH_DIR
            value: /scratch
          - name: ALIBABA_CLOUD_CREDENTIALS_FILE
            value: /credentials/cloud
        volumeMounts:
          - mountPath: /plugins
            name: plugins
          - mountPath: /scratch
            name: scratch
          - mountPath: /credentials
            name: cloud-credentials
      initContainers:
      - image: registry.cn-hangzhou.aliyuncs.com/acs/velero-plugin-alibabacloud:v1.2-991b590
        imagePullPolicy: IfNotPresent
        name: velero-plugin-alibabacloud
        volumeMounts:
        - mountPath: /target
          name: plugins
      volumes:
        - emptyDir: {}
          name: plugins
        - emptyDir: {}
          name: scratch
        - name: cloud-credentials
          secret:
            secretName: cloud-credentials

Deploy the Velero server to k8s

# Create the namespace
kubectl create namespace velero
# Create the secret from credentials-velero
kubectl create secret generic cloud-credentials --namespace velero --from-file cloud=install/credentials-velero
# Deploy the CRDs
kubectl apply -f install/00-crds.yaml
# Deploy Velero
kubectl apply -f install/01-velero.yaml

4. Backup test

Here we use Velero to back up some resources in a cluster, so that they can be restored quickly after a failure or an operator mistake. First, deploy the following YAML:

---
apiVersion: v1
kind: Namespace
metadata:
  name: nginx-example
  labels:
    app: nginx
---
# Note: apps/v1beta1 was removed in Kubernetes 1.16; on newer clusters use apps/v1, which also requires spec.selector.
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: nginx-example
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx:1.7.9
        name: nginx
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: nginx
  name: my-nginx
  namespace: nginx-example
spec:
  ports:
  - port: 80
    targetPort: 80
  selector:
    app: nginx

You can back up the whole cluster or only the namespaces you need. Here we back up a single namespace, nginx-example:

[rsync@velero-plugin]$ kubectl get pods -n nginx-example
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-5c689d88bb-f8vsx   1/1     Running   0          6m31s
nginx-deployment-5c689d88bb-rt2zk   1/1     Running   0          6m32s
[rsync@velero]$ cd velero-v1.4.0-linux-amd64/
[rsync@velero-v1.4.0-linux-amd64]$ ll
total 56472
drwxrwxr-x 4 rsync rsync     4096 Jun  1 15:02 examples
-rw-r--r-- 1 rsync rsync    10255 Dec 10 01:08 LICENSE
-rwxr-xr-x 1 rsync rsync 57810814 May 27 04:33 velero
[rsync@velero-v1.4.0-linux-amd64]$ ./velero backup create nginx-backup --include-namespaces nginx-example --wait
Backup request "nginx-backup" submitted successfully.
Waiting for backup to complete. You may safely press ctrl-c to stop waiting - your backup will continue in the background.
.
Backup completed with status: Completed. You may check for more information using the commands `velero backup describe nginx-backup` and `velero backup logs nginx-backup`.




Delete the namespace

[rsync@velero-v1.4.0-linux-amd64]$ kubectl delete namespaces nginx-example
namespace "nginx-example" deleted
[rsync@velero-v1.4.0-linux-amd64]$ kubectl get pods -n nginx-example
No resources found.

Restore

[rsync@velero-v1.4.0-linux-amd64]$ ./velero restore create --from-backup nginx-backup --wait
Restore request "nginx-backup-20200603180922" submitted successfully.
Waiting for restore to complete. You may safely press ctrl-c to stop waiting - your restore will continue in the background.
Restore completed with status: Completed. You may check for more information using the commands `velero restore describe nginx-backup-20200603180922` and `velero restore logs nginx-backup-20200603180922`.
[rsync@velero-v1.4.0-linux-amd64]$ kubectl get pods -n nginx-example
NAME                                READY   STATUS              RESTARTS   AGE
nginx-deployment-5c689d88bb-f8vsx   1/1     Running             0          5s
nginx-deployment-5c689d88bb-rt2zk   0/1     ContainerCreating   0          5s

As you can see, the namespace and its pods have been restored.

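Migration works the same way as backup and restore. Now for a special case: deploy another workload after the backup was taken, then restore, and see whether the restore deletes the newly deployed workload.

A new tomcat container has been created:

[rsync@tomcat-test]$ kubectl get pods -n nginx-example
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-5c689d88bb-f8vsx   1/1     Running   0          65m
nginx-deployment-5c689d88bb-rt2zk   1/1     Running   0          65m
tomcat-test-sy-677ff78f6b-rc5vq     1/1     Running   0          7s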

Run a restore

[rsync@velero-v1.4.0-linux-amd64]$ ./velero restore create --from-backup nginx-backup
Restore request "nginx-backup-20200603191726" submitted successfully.
Run `velero restore describe nginx-backup-20200603191726` or `velero restore logs nginx-backup-20200603191726` for more details.
[rsync@velero-v1.4.0-linux-amd64]$ kubectl get pods -n nginx-example
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-5c689d88bb-f8vsx   1/1     Running   0          68m
nginx-deployment-5c689d88bb-rt2zk   1/1     Running   0          68m
tomcat-test-sy-677ff78f6b-rc5vq     1/1     Running   0          2m33s

As you can see, nothing was overwritten: the pod ages are unchanged and the tomcat pod is still there.

Delete the nginx deployment, then restore again

[rsync@velero-v1.4.0-linux-amd64]$ kubectl delete deployment nginx-deployment -n nginx-example
deployment.extensions "nginx-deployment" deleted
[rsync@velero-v1.4.0-linux-amd64]$ kubectl get pods -n nginx-example
NAME                              READY   STATUS    RESTARTS   AGE
tomcat-test-sy-677ff78f6b-rc5vq   1/1     Running   0          4m18s
[rsync@velero-v1.4.0-linux-amd64]$ ./velero restore create --from-backup nginx-backup
Restore request "nginx-backup-20200603191949" submitted successfully.
Run `velero restore describe nginx-backup-20200603191949` or `velero restore logs nginx-backup-20200603191949` for more details.
[rsync@velero-v1.4.0-linux-amd64]$ kubectl get pods -n nginx-example
NAME                                READY   STATUS              RESTARTS   AGE
nginx-deployment-5c689d88bb-f8vsx   1/1     Running             0          2s
nginx-deployment-5c689d88bb-rt2zk   0/1     ContainerCreating   0          2s
tomcat-test-sy-677ff78f6b-rc5vq     1/1     Running             0          4m49s

As you can see, the tomcat workload is unaffected.

Conclusion: a Velero restore does not overwrite. It only restores resources that do not exist in the current cluster, and existing resources are not rolled back to the backed-up version. To roll a resource back, delete the existing one before running the restore.
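
The delete-then-restore sequence can be wrapped in a small helper. The sketch below is an illustration only: the function names and the DRY_RUN switch are hypothetical, and the kubectl and velero invocations are the same ones used in the transcripts in this section. With DRY_RUN=1 (the default here) it only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch: roll a deployment back to its backed-up state by deleting it first.
# DRY_RUN=1 (the default in this sketch) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

rollback_deployment() {
  local namespace="$1" deployment="$2" backup="$3"
  # Velero skips resources that already exist, so remove the current one first.
  run kubectl delete deployment "$deployment" -n "$namespace"
  # Recreate it from the backup and wait for the restore to finish.
  run velero restore create --from-backup "$backup" --wait
}

rollback_deployment nginx-example nginx-deployment nginx-backup
```

Set DRY_RUN=0 to execute the commands for real. Note that the restore still brings back every other missing resource in the backup, not just the deleted deployment.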

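5. Advanced usage

You can set up a periodic scheduled backup

# Back up every day at 01:00
velero create schedule <SCHEDULE NAME> --schedule="0 1 * * *"
# Back up every day at 01:00 and keep each backup for 48 hours
velero create schedule <SCHEDULE NAME> --schedule="0 1 * * *" --ttl 48h
# Back up every 6 hours
velero create schedule <SCHEDULE NAME> --schedule="@every 6h"
# Back up the web namespace once a day
velero create schedule <SCHEDULE NAME> --schedule="@every 24h" --include-namespaces web

Backups created by a schedule are named `<SCHEDULE NAME>-<TIMESTAMP>`, and the restore command is `velero restore create --from-backup <SCHEDULE NAME>-<TIMESTAMP>`.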
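
Because a scheduled backup's name is the schedule name followed by a timestamp, the most recent backup of a schedule can be picked by sorting the names. The sketch below assumes a 14-digit YYYYMMDDhhmmss timestamp (the same shape as the restore names shown earlier in this post) and a schedule name without regex metacharacters. The function name and the sample backup names are hypothetical, and the names are fed in on stdin rather than fetched from the cluster.

```shell
#!/usr/bin/env bash
# Sketch: pick the newest backup created by a given schedule.
# Reads backup names (one per line) on stdin; assumes names look like
# <schedule>-<14-digit timestamp>, so lexicographic order is time order.

latest_scheduled_backup() {
  local schedule="$1"
  grep -E "^${schedule}-[0-9]{14}\$" | sort | tail -n 1
}

# Hypothetical backup names, for illustration only.
latest="$(printf '%s\n' \
  daily-20200601010000 \
  daily-20200603010000 \
  daily-20200602010000 \
  weekly-20200604010000 \
  | latest_scheduled_backup daily)"

# Print the restore command for that backup instead of running it.
echo "velero restore create --from-backup ${latest}"
```

For the sample names above this prints the restore command for daily-20200603010000.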

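To back up and restore persistent volumes, create the backup like this:

velero backup create nginx-backup-volume --snapshot-volumes --include-namespaces nginx-example

This backup creates snapshots of the cloud disks in the cluster's region (NAS and OSS storage are not supported yet). A cloud disk can only be restored from a snapshot within the same region.

The restore command is:

velero restore create --from-backup nginx-backup-volume --restore-volumes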

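Deleting backups

  1. Option one: delete the backup directly with a command
velero delete backups default-backup
  2. Option two: let the backup expire automatically by passing a TTL when creating it
velero backup create <BACKUP NAME> --ttl <TTL>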

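You can also add a specific label to resources; labeled resources are excluded from backups.

# Add the label
kubectl label -n <ITEM_NAMESPACE> <RESOURCE>/<NAME> velero.io/exclude-from-backup=true
# Label the default namespace
kubectl label -n default namespace/default velero.io/exclude-from-backup=true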