• Basic usage
  • Dynamic provisioning with StorageClass
  • Volume provisioning modes
  • Static provisioning
  • Dynamic provisioning
  • Pitfalls


Basic usage

For basic usage of Ceph with Kubernetes, the following two articles give detailed steps and the points to watch out for:

https://github.com/kubernetes/kubernetes/tree/master/examples/volumes/cephfs

http://tonybai.com/2017/05/08/mount-cephfs-acrossing-nodes-in-kubernetes-cluster/

  1. When generating the ceph-secret, base64-encode the key stored in /etc/ceph/admin.secret.
  2. The RBD image has to be created manually beforehand.
  3. If the node kernel is too old to support some RBD image features, the offending features can be turned off when the image is created, or in the Ceph configuration (see the sketch below).
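
For point 3, a minimal sketch (the image name foo and the size are placeholders; the exact feature names depend on the Ceph release):

# create an image with only the layering feature, which older kernels can map
rbd create foo --size 4096 --pool rbd --image-feature layering
# or strip unsupported features from an existing image
rbd feature disable rbd/foo exclusive-lock object-map fast-diff deep-flatten
# or lower the default for new images in /etc/ceph/ceph.conf:
# rbd_default_features = 1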

A pod created by following those documents keeps its RBD image even after the pod is deleted.
For example, suppose pod A on node 1 mounts rbd B. The mapping can be observed on node 1 with the mount command. After pod A is deleted, rbd B is still mapped on node 1.

If a new pod C that also mounts rbd B is then created and scheduled onto node 2, rbd B can be seen moving from node 1 to node 2, and the data written earlier is still there.
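
To watch this from the nodes (plain observation commands; device names will differ):

# list the RBD images currently mapped on this node
rbd showmapped
# show where kubelet has mounted the mapped device
mount | grep rbd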

Dynamic provisioning with StorageClass

Volume provisioning modes

Static provisioning

Static provisioning simply means creating the PV resources by hand ahead of time.
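
A rough sketch of a statically provisioned RBD PV (the image name foo is a placeholder that must already exist in the pool, and ceph-secret is assumed to hold the Ceph client key):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: ceph-rbd-pv
spec:
  capacity:
    storage: 4Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  rbd:
    monitors:
      - 172.16.18.5:6789
      - 172.16.18.6:6789
      - 172.16.18.7:6789
    pool: rbd
    image: foo
    user: admin
    secretRef:
      name: ceph-secret
    fsType: ext4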

Dynamic provisioning

Dynamic provisioning is one of the newer Kubernetes features: it allows storage volumes to be created on-demand, so a PV no longer has to be created by hand each time. It relies on a StorageClass being configured, and the PVC has to reference that StorageClass.

A PVC has a storageClassName field that names the StorageClass to use; setting the field to the empty string "" disables dynamic provisioning for that claim.

Leaving storageClassName out entirely and writing storageClassName: "" are two different things. In the first case Kubernetes uses the default StorageClass to dynamically provision a volume for the claim (some StorageClass has to be marked as the DefaultStorageClass). In the second case dynamic provisioning is disabled outright. If no default StorageClass is defined, the two behave the same.
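
A minimal sketch of the two variants (names and sizes are placeholders):

# no storageClassName: falls back to the default StorageClass, if one is marked as default
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: claim-default
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
# storageClassName: "" disables dynamic provisioning; only a pre-existing PV can satisfy the claim
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: claim-static-only
spec:
  storageClassName: ""
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi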

https://kubernetes.io/docs/concepts/storage/persistent-volumes/#dynamic covers this in detail.

Pitfalls

Following the official documentation, the steps are:
1. Generate the ceph-secret:

[root@walker-2 ~]# cat /etc/ceph/admin.secret | base64
QVFEdmRsTlpTSGJ0QUJBQUprUXh4SEV1ZGZ5VGNVa1U5cmdWdHc9PQo=
[root@walker-2 ~]# kubectl create secret generic ceph-secret --type="kubernetes.io/rbd" --from-literal=key='QVFEdmRsTlpTSGJ0QUJBQUprUXh4SEV1ZGZ5VGNVa1U5cmdWdHc9PQo=' --namespace=kube-system
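
To double-check that the secret landed in the expected namespace (a sanity check, not part of the original steps):

kubectl get secret ceph-secret --namespace=kube-system -o yaml
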
  2. StorageClass configuration:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: axiba
provisioner: kubernetes.io/rbd
parameters:
  monitors: 172.16.18.5:6789,172.16.18.6:6789,172.16.18.7:6789
  adminId: admin
  adminSecretName: ceph-secret
  adminSecretNamespace: hehe
  pool: rbd
  userId: admin
  userSecretName: ceph-secret
  fsType: ext4
  imageFormat: "2"
  imageFeatures: "layering"
  • adminId: Ceph client ID that is allowed to create images in the pool; defaults to admin
  • adminSecretName: name of the secret for the admin client (the secret created in step 1)
  • adminSecretNamespace: namespace that the admin secret lives in
  • pool: the Ceph pool to use; pools can be listed with ceph osd pool ls
  • userId: Ceph client ID used to map the RBD image; defaults to admin
  • userSecretName: same as adminSecretName, but for userId
  • imageFormat: Ceph RBD image format, "1" or "2"; defaults to "1". Format 2 supports more RBD features
  • imageFeatures: only takes effect when imageFormat is "2"; defaults to "" (no extra features)
  3. PVC configuration:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: axiba
  namespace: hehe
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 4Gi
  storageClassName: axiba
  4. Pod configuration:
apiVersion: v1
kind: Pod
metadata:
  name: axiba
  namespace: hehe
spec:
  containers:
  - image: nginx:latest
    imagePullPolicy: IfNotPresent
    name: nginx
    resources: {}
    volumeMounts:
    - name: axiba
      mountPath: /usr/share/nginx/html
  volumes:
    - name: axiba
      persistentVolumeClaim:
        claimName: axiba

After creating the pod with the configuration above, the PVC stays stuck in the Pending state:

[root@walker-1 ~]# kubectl describe pvc axiba --namespace=hehe
Name:          axiba
Namespace:     hehe
StorageClass:  axiba
Status:        Pending
Volume:        
Labels:        <none>
Annotations:   kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"v1","kind":"PersistentVolumeClaim","metadata":{"annotations":{},"name":"axiba","namespace":"hehe"},"spec":{"accessModes":["ReadWriteOnce...
               volume.beta.kubernetes.io/storage-provisioner=kubernetes.io/rbd
Capacity:      
Access Modes:  
Events:
  Type     Reason              Age                  From                         Message
  ----     ------              ----                 ----                         -------
  Warning  ProvisioningFailed  32s (x1701 over 7h)  persistentvolume-controller  Failed to provision volume with StorageClass "axiba": failed to create rbd image: executable file not found in $PATH, command output:

The error says the rbd executable cannot be found, even though ceph is installed on every node:

[root@walker-1 ~]# which rbd
/usr/bin/rbd

Some googling led to issue #38923:

  • Volume Provisioning: Currently, if you want dynamic provisioning, RBD provisioner in controller-manager needs to access rbd binary to create new image in ceph cluster for your PVC.
    external-storage plans to move volume provisioners from in-tree to out-of-tree, there will be a separated RBD provisioner container image with rbd utility included (kubernetes-incubator/external-storage#200), then controller-manager do not need access rbd binary anymore.
  • Volume Attach/Detach: kubelet needs to access rbd binary to attach (rbd map) and detach (rbd unmap) RBD image on node. If kubelet is running on the host, host needs to install rbd utility (install ceph-common package on most Linux distributions).

In short: the controller-manager container, which is responsible for dynamically creating the RBD image, does not have ceph installed, so the rbd command is unavailable inside it.
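
This can be confirmed from the cluster if the controller-manager runs as a static pod (the pod name below is hypothetical and depends on the node name):

# rbd exists on the host ...
which rbd
# ... but not inside the controller-manager container
kubectl -n kube-system exec kube-controller-manager-walker-1.novalocal -- which rbd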

https://github.com/kubernetes-incubator/external-storage/issues/200

Support for the RBD provisioner is moving from in-tree to out-of-tree, so an out-of-tree solution is needed.

https://github.com/kubernetes/kubernetes/issues/38923

First, create an rbd-provisioner Deployment:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: rbd-provisioner
  namespace: kube-system
spec:
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: rbd-provisioner
    spec:
      serviceAccountName: rbd-provision
      containers:
      - name: rbd-provisioner
        image: "quay.io/external_storage/rbd-provisioner:v0.1.1"
        env:
        - name: PROVISIONER_NAME
          value: ceph.com/rbd

As of 2017-11-09 the newest image tag is v0.1.1; https://quay.io/repository/external_storage/rbd-provisioner lists the available tags.

I made a couple of small changes: the rbd-provisioner is placed in the kube-system namespace, and an rbd-provision ServiceAccount is created for it. Without that ServiceAccount, once the Deployment is running, kubectl logs rbd-provisioner-xxx -f keeps reporting:

Failed to list *v1.PersistentVolumeClaim: User "system:serviceaccount:kube-system:default" cannot list persistentvolumeclaims at the cluster scope. (get persistentvolumeclaims)

and similar permission errors.

The RBAC configuration is as follows:

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rbd-provision
subjects:
- kind: ServiceAccount
  name: rbd-provision
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: system:controller:persistent-volume-binder
  apiGroup: rbac.authorization.k8s.io
---
kind: ServiceAccount
apiVersion: v1
metadata: 
  name: rbd-provision
  namespace: kube-system
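
Whether the binding took effect can be checked with impersonation (an optional sanity check):

kubectl auth can-i list persistentvolumeclaims --as=system:serviceaccount:kube-system:rbd-provision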

Update the StorageClass accordingly:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: axiba
provisioner: ceph.com/rbd
parameters:
  monitors: 172.16.18.5:6789,172.16.18.6:6789,172.16.18.7:6789
  adminId: admin
  adminSecretName: ceph-secret
  adminSecretNamespace: hehe
  pool: rbd
  userId: admin
  userSecretName: ceph-secret
  imageFormat: "2"
  imageFeatures: "layering"

"No need to and do not add fsType: ext4 to storage class." That is the provisioner author's own wording; I don't know the exact reason. The volume is mounted as ext4 by default.
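
Once the pod defined above is running, the filesystem type can be checked from inside it (a sanity check, not part of the original steps):

kubectl exec -n hehe axiba -- df -T /usr/share/nginx/html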

Once all of this is created, an RBD image and a PV show up automatically:

[root@walker-1 hehe]# kubectl describe pvc axiba --namespace=hehe
Name:          axiba
Namespace:     hehe
StorageClass:  axiba
Status:        Bound
Volume:        pvc-61785500-c434-11e7-96a0-fa163e028b17
Labels:        <none>
Annotations:   control-plane.alpha.kubernetes.io/leader={"holderIdentity":"8444d083-c453-11e7-9d04-b2667a809e20","leaseDurationSeconds":15,"acquireTime":"2017-11-08T07:07:43Z","renewTime":"2017-11-08T07:08:14Z","lea...
               kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"v1","kind":"PersistentVolumeClaim","metadata":{"annotations":{},"name":"axiba","namespace":"hehe"},"spec":{"accessModes":["ReadWriteOnce...
               pv.kubernetes.io/bind-completed=yes
               pv.kubernetes.io/bound-by-controller=yes
               volume.beta.kubernetes.io/storage-provisioner=ceph.com/rbd
Capacity:      4Gi
Access Modes:  RWO
Events:        <none>
[root@walker-1 hehe]# kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM        STORAGECLASS   REASON    AGE
pvc-61785500-c434-11e7-96a0-fa163e028b17   4Gi        RWO            Delete           Bound     hehe/axiba   axiba                    20h
[root@walker-1 hehe]# rbd ls
kubernetes-dynamic-pvc-844efd73-c453-11e7-9d04-b2667a809e20

But when the pod tries to mount the volume, it times out:

[root@walker-1 hehe]# kubectl describe po axiba --namespace=hehe
Name:         axiba
Namespace:    hehe
...

Events:
  Type     Reason                 Age              From                         Message
  ----     ------                 ----             ----                         -------
  Normal   Scheduled              11m              default-scheduler            Successfully assigned aaa to walker-4.novalocal
  Normal   SuccessfulMountVolume  11m              kubelet, walker-4.novalocal  MountVolume.SetUp succeeded for volume "default-token-sx9pb"
  Warning  FailedMount            7m (x2 over 9m)  kubelet, walker-4.novalocal  Unable to mount volumes for pod "aaa_default(e34ec7f7-c528-11e7-96a0-fa163e028b17)": timeout expired waiting for volumes to attach/mount for pod "default"/"aaa". list of unattached/unmounted volumes=[aaa]
  Warning  FailedSync             7m (x2 over 9m)  kubelet, walker-4.novalocal  Error syncing pod
  Normal   SuccessfulMountVolume  6m (x2 over 6m)  kubelet, walker-4.novalocal  MountVolume.SetUp succeeded for volume "pvc-597c8a48-c503-11e7-96a0-fa163e028b17"
  Normal   Pulled                 6m               kubelet, walker-4.novalocal  Container image "nginx:latest" already present on machine
  Normal   Created                6m               kubelet, walker-4.novalocal  Created container
  Normal   Started                6m               kubelet, walker-4.novalocal  Started container

I retried a few times and kept hitting the same error, and the issues offered no explanation. The Ceph setup is definitely correct: RBD images can be inspected and mapped by hand, and the fact that the image was created automatically also shows the configuration is fine. Oddly enough, after a while the pod recovered on its own and mounted the RBD successfully. Anyway, I am keeping an eye on it.