Official documentation: https://www.rabbitmq.com/kubernetes/operator/quickstart-operator.html

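Requirements

1. Kubernetes 1.18 or later.
2. kubectl on the master node, configured to manage the whole cluster.
3. A default StorageClass. By default, the RabbitMQ Cluster Operator provisions a 10Gi PVC for each RabbitMQ instance, and the official YAML file does not name a StorageClass, so the claim falls back to the cluster default.

Without a default StorageClass, the pods of the RabbitMQ cluster stay in Pending, with the reason: pod has unbound immediate PersistentVolumeClaims
List the StorageClasses and look for the one marked (default):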

# kubectl get sc -o wide
NAME                    PROVISIONER                     RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
nfs-164                 nfs-nfs-164                     Delete          WaitForFirstConsumer   false                  37d
rook-ceph-block         rook-ceph.rbd.csi.ceph.com      Delete          Immediate              true                   24h
rook-cephfs (default)   rook-ceph.cephfs.csi.ceph.com   Delete          Immediate              true                   23h
storage                 nfs-storage                     Delete          WaitForFirstConsumer   false                  39d
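
To check for a default StorageClass from a script rather than by eye, the listing above can be filtered. This is a minimal sketch that relies on the (default) marker kubectl prints next to the name; the function name default_storage_classes is hypothetical.

```shell
# Print the names of StorageClasses that kubectl marks as "(default)".
# Assumes the default table output of `kubectl get sc`, where the marker
# appears as the second whitespace-separated field.
default_storage_classes() {
  kubectl get sc --no-headers | awk '$2 == "(default)" { print $1 }'
}

# Usage (needs a live cluster, so it is left commented out):
# default_storage_classes
#
# If nothing is printed, an existing class can be marked as default with the
# standard annotation (replace <name> with one of your StorageClasses):
# kubectl patch storageclass <name> \
#   -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
```

An empty result means the RabbitMQ pods would hit the Pending state described above.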
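
Installation steps
  • Install the RabbitMQ Cluster Operator
  • Deploy a RabbitMQ cluster with the Cluster Operator
  • View the RabbitMQ logs
  • Access the RabbitMQ management UI
  • Connect an application to the RabbitMQ cluster
Installation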

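Comparing the two installation methods

The first method uses a kubectl plugin; see https://cloud.tencent.com/developer/article/1782766
It requires installing the plugin manager krew first, and the installation downloads files from GitHub, so it is not recommended if your access to GitHub is slow.

The second method installs from a YAML file, and it is the one used here.

Installing cluster-operator from the YAML file

YAML file download URL: https://github.com/rabbitmq/cluster-operator/releases/download/v1.7.0/cluster-operator.yml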

# kubectl create -f cluster-operator.yml
namespace/rabbitmq-system created
customresourcedefinition.apiextensions.k8s.io/rabbitmqclusters.rabbitmq.com created
serviceaccount/rabbitmq-cluster-operator created
role.rbac.authorization.k8s.io/rabbitmq-cluster-leader-election-role created
clusterrole.rbac.authorization.k8s.io/rabbitmq-cluster-operator-role created
rolebinding.rbac.authorization.k8s.io/rabbitmq-cluster-leader-election-rolebinding created
clusterrolebinding.rbac.authorization.k8s.io/rabbitmq-cluster-operator-rolebinding created
deployment.apps/rabbitmq-cluster-operator created

This creates a new namespace: rabbitmq-system

# kubectl get all -n rabbitmq-system
NAME                                             READY   STATUS    RESTARTS   AGE
pod/rabbitmq-cluster-operator-7c65454ff9-f52b2   1/1     Running   0          32m

NAME                                        READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/rabbitmq-cluster-operator   1/1     1            1           32m

NAME                                                   DESIRED   CURRENT   READY   AGE
replicaset.apps/rabbitmq-cluster-operator-7c65454ff9   1         1         1       32m

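The output above shows the following ownership chain:

Resource type: deployment => replicaset => pod
Resource name: rabbitmq-cluster-operator => rabbitmq-cluster-operator-7c65454ff9 => rabbitmq-cluster-operator-7c65454ff9-f52b2

The installation also adds a new custom resource, rabbitmqclusters.rabbitmq.com, which defines the API used to create RabbitMQ clusters, along with some RBAC roles. The operator needs these roles to create, update, and delete RabbitMQ clusters.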

# kubectl get customresourcedefinitions.apiextensions.k8s.io 
NAME                                             CREATED AT
...
rabbitmqclusters.rabbitmq.com                    2021-06-04T02:21:42Z
...

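Creating a RabbitMQ cluster with cluster-operator

A simple YAML example is available at: https://github.com/rabbitmq/cluster-operator/tree/main/docs/examples/hello-world
YAML file name: rabbitmq.yaml
This is the simplest possible RabbitmqCluster definition. The only property set explicitly is the cluster name; everything else is configured from the Cluster Operator's defaults.

The examples folder contains many other references, such as creating a RabbitMQ cluster with TLS or mTLS, setting up a cluster with production defaults, and adding community plugins. They are available at: https://github.com/rabbitmq/cluster-operator/tree/main/docs/examples/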

# cat rabbitmq.yaml
apiVersion: rabbitmq.com/v1beta1
kind: RabbitmqCluster
metadata:
  name: my-rabbit-cluster
  namespace: rabbitmq-system

Note: if no namespace is given, the cluster is deployed to default; here it is changed to rabbitmq-system.

# kubectl create -f rabbitmq.yaml 
rabbitmqcluster.rabbitmq.com/my-rabbit-cluster created

This creates a RabbitMQ cluster named my-rabbit-cluster in the rabbitmq-system namespace. You can see its pod once the cluster has been created:

# kubectl get pod -n rabbitmq-system
NAME                                         READY   STATUS    RESTARTS   AGE
my-rabbit-cluster-server-0                   1/1     Running   0          80s

You can also see the instance of the rabbitmqclusters.rabbitmq.com custom resource that was created.

# kubectl get rabbitmqclusters.rabbitmq.com -n rabbitmq-system
NAME                AGE
my-rabbit-cluster   2m33s
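
Rather than polling kubectl get pod by hand, a script can block until the cluster's pods are ready. The sketch below assumes the operator names the StatefulSet <cluster-name>-server, which matches the pod name my-rabbit-cluster-server-0 seen above; the function name wait_for_cluster is hypothetical.

```shell
# Block until the StatefulSet behind a RabbitmqCluster has finished rolling out.
# $1: RabbitmqCluster name, $2: namespace, $3: timeout (for example 300s)
# Assumption: the StatefulSet is named "<cluster-name>-server".
wait_for_cluster() {
  kubectl rollout status "statefulset/$1-server" -n "$2" --timeout="$3"
}

# Usage (needs a live cluster, so it is left commented out):
# wait_for_cluster my-rabbit-cluster rabbitmq-system 300s
```

The command exits with a non-zero status if the rollout does not complete within the timeout, which makes it usable as a gate in deployment scripts.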

Viewing the RabbitMQ logs
# kubectl logs my-rabbit-cluster-server-0 -n rabbitmq-system
 Starting RabbitMQ 3.8.16 on Erlang 24.0.2
 Copyright (c) 2007-2021 VMware, Inc. or its affiliates.
 Licensed under the MPL 2.0. Website: https://rabbitmq.com

  ##  ##      RabbitMQ 3.8.16
  ##  ##
  ##########  Copyright (c) 2007-2021 VMware, Inc. or its affiliates.
  ######  ##
  ##########  Licensed under the MPL 2.0. Website: https://rabbitmq.com

  Doc guides: https://rabbitmq.com/documentation.html
  Support:    https://rabbitmq.com/contact.html
  Tutorials:  https://rabbitmq.com/getstarted.html
  Monitoring: https://rabbitmq.com/monitoring.html

  Logs: <stdout>

  Config file(s): /etc/rabbitmq/rabbitmq.conf
                  /etc/rabbitmq/conf.d/10-operatorDefaults.conf
                  /etc/rabbitmq/conf.d/11-default_user.conf
                  /etc/rabbitmq/conf.d/90-userDefinedConfiguration.conf

  Starting broker...2021-06-04 03:05:59.646 [info] <0.273.0> 
 node           : rabbit@my-rabbit-cluster-server-0.my-rabbit-cluster-nodes.rabbitmq-system
 home dir       : /var/lib/rabbitmq
 config file(s) : /etc/rabbitmq/rabbitmq.conf
                : /etc/rabbitmq/conf.d/10-operatorDefaults.conf
                : /etc/rabbitmq/conf.d/11-default_user.conf
                : /etc/rabbitmq/conf.d/90-userDefinedConfiguration.conf
 cookie hash    : kH25nUaPr1ngafL6UipoTQ==
 log(s)         : <stdout>
 database dir   : /var/lib/rabbitmq/mnesia/rabbit@my-rabbit-cluster-server-0.my-rabbit-cluster-nodes.rabbitmq-system

Accessing the RabbitMQ management UI

Get the username and password

username="$(kubectl get secret my-rabbit-cluster-default-user -n rabbitmq-system -o jsonpath='{.data.username}' | base64 --decode)"
echo "username: $username"  

password="$(kubectl get secret my-rabbit-cluster-default-user -n rabbitmq-system -o jsonpath='{.data.password}' | base64 --decode)"
echo "password: $password"

By default, a Service of type ClusterIP is created

# kubectl get svc -n rabbitmq-system
NAME                      TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                        AGE
my-rabbit-cluster         ClusterIP   10.3.255.186   <none>        5672/TCP,15672/TCP,15692/TCP   8m18s
my-rabbit-cluster-nodes   ClusterIP   None           <none>        4369/TCP,25672/TCP             8m18s

Option 1: change the Service type from ClusterIP to NodePort, then access the UI through a node IP and the assigned NodePort

# kubectl edit svc my-rabbit-cluster -n rabbitmq-system

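Note: this worked the first time, but after deleting and recreating the cluster, the same edit no longer switched the Service to NodePort. A likely cause is that the operator reconciles the Service back to the type declared in the RabbitmqCluster spec, which overwrites manual edits.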
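
A more durable alternative to editing the generated Service is to declare the type on the RabbitmqCluster resource itself, so that the operator creates the Service with it. The sketch below assumes the spec.service.type field of the rabbitmq.com/v1beta1 API; the helper name service_type_patch is hypothetical.

```shell
# Build a JSON merge patch that sets spec.service.type on a RabbitmqCluster.
# $1: desired Service type, for example NodePort or LoadBalancer
service_type_patch() {
  printf '{"spec":{"service":{"type":"%s"}}}\n' "$1"
}

# Usage (needs a live cluster, so it is left commented out).
# Custom resources need --type merge, since strategic merge is not supported:
# kubectl patch rabbitmqcluster my-rabbit-cluster -n rabbitmq-system \
#   --type merge -p "$(service_type_patch NodePort)"
# kubectl get svc my-rabbit-cluster -n rabbitmq-system
```

After the patch, the PORT(S) column of kubectl get svc should list the allocated node ports.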

Option 2: forward a local port to the Service with kubectl port-forward

# kubectl port-forward svc/my-rabbit-cluster 15672 -n rabbitmq-system
Forwarding from 127.0.0.1:15672 -> 15672
Forwarding from [::1]:15672 -> 15672

Now open localhost:15672 in a browser to see the management UI. Alternatively, verify access with curl:

curl -u "$username:$password" localhost:15672/api/overview
[root@develop-master-1 ~]# username="$(kubectl get secret my-rabbit-cluster-default-user -n rabbitmq-system -o jsonpath='{.data.username}' | base64 --decode)" && echo "username: $username"
username: YNbrfwcmEqnuJD6BZthuWZP9bTJQuYmW
[root@develop-master-1 ~]# password="$(kubectl get secret my-rabbit-cluster-default-user -n rabbitmq-system -o jsonpath='{.data.password}' | base64 --decode)" && echo "password: $password"
password: 88DzY3WQ51Q2Q9FfWJKrTaQpLUgSfeWP
[root@develop-master-1 ~]# curl -uYNbrfwcmEqnuJD6BZthuWZP9bTJQuYmW:88DzY3WQ51Q2Q9FfWJKrTaQpLUgSfeWP localhost:15672/api/overview
{"management_version":"3.8.16","rates_mode":"basic","sample_retention_policies":{"global":[600,3600,28800,86400],"basic":[600,3600],"detailed":[600]},"exchange_types":[{"name":"direct","description":"AMQP direct exchange, as per the AMQP specification","enabled":true},{"name":"fanout","description":"AMQP fanout exchange, as per the AMQP specification","enabled":true},{"name":"headers","description":"AMQP headers exchange, as per the AMQP specification","enabled":true},{"name":"topic","description":"AMQP topic exchange, as per the AMQP specification","enabled":true}],"product_version":"3.8.16","product_name":"RabbitMQ","rabbitmq_version":"3.8.16","cluster_name":"my-rabbit-cluster","erlang_version":"24.0.2","erlang_full_version":"Erlang/OTP 24 [erts-12.0.2] [source] [64-bit] [smp:2:2] [ds:2:2:10] [async-threads:1]","disable_stats":false,"enable_queue_totals":false,"message_stats":{},"churn_rates":{"channel_closed":0,"channel_closed_details":{"rate":0.0},"channel_created":0,"channel_created_details":{"rate":0.0},"connection_closed":60,"connection_closed_details":{"rate":0.0},"connection_created":0,"connection_created_details":{"rate":0.0},"queue_created":0,"queue_created_details":{"rate":0.0},"queue_declared":0,"queue_declared_details":{"rate":0.0},"queue_deleted":0,"queue_deleted_details":{"rate":0.0}},"queue_totals":{},"object_totals":{"channels":0,"connections":0,"consumers":0,"exchanges":7,"queues":0},"statistics_db_event_queue":0,"node":"rabbit@my-rabbit-cluster-server-0.my-rabbit-cluster-nodes.default","listeners":[{"node":"rabbit@my-rabbit-cluster-server-0.my-rabbit-cluster-nodes.default","protocol":"amqp","ip_address":"::","port":5672,"socket_opts":{"backlog":128,"nodelay":true,"linger":[true,0],"exit_on_close":false}},{"node":"rabbit@my-rabbit-cluster-server-0.my-rabbit-cluster-nodes.default","protocol":"clustering","ip_address":"::","port":25672,"socket_opts":[]},{"node":"rabbit@my-rabbit-cluster-server-0.my-rabbit-cluster-nodes.default","protocol":"http","ip_address":"::","port":15672,"socket_opts":{"cowboy_opts":{"sendfile":false},"port":15672}},{"node":"rabbit@my-rabbit-cluster-server-0.my-rabbit-cluster-nodes.default","protocol":"http/prometheus","ip_address":"::","port":15692,"socket_opts":{"cowboy_opts":{"sendfile":false},"port":15692,"protocol":"http/prometheus"}}],"contexts":[{"ssl_opts":[],"node":"rabbit@my-rabbit-cluster-server-0.my-rabbit-cluster-nodes.default","description":"RabbitMQ Management","path":"/","cowboy_opts":"[{sendfile,false}]","port":"15672"},{"ssl_opts":[],"node":"rabbit@my-rabbit-cluster-server-0.my-rabbit-cluster-nodes.default","description":"RabbitMQ Prometheus","path":"/","cowboy_opts":"[{sendfile,false}]","port":"15692","protocol":"'http/prometheus'"}]}

The management UI shown below was reached through a Kuboard proxy.
[Screenshot: RabbitMQ management UI]

Connecting an application to the RabbitMQ cluster
# username="$(kubectl get secret my-rabbit-cluster-default-user -n rabbitmq-system -o jsonpath='{.data.username}' | base64 --decode)"
# password="$(kubectl get secret my-rabbit-cluster-default-user -n rabbitmq-system -o jsonpath='{.data.password}' | base64 --decode)"
# service="$(kubectl get service my-rabbit-cluster -n rabbitmq-system -o jsonpath='{.spec.clusterIP}')"
# kubectl run perf-test --image=pivotalrabbitmq/perf-test -- --uri amqp://$username:$password@$service

pod/perf-test created
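
The commands above connect through the Service's cluster IP, which changes if the Service is recreated. As an alternative sketch, the URI can be built from the Service's in-cluster DNS name, <service>.<namespace>.svc. The helper name amqp_uri is hypothetical, and it assumes the credentials contain no characters that would need percent-encoding in a URI.

```shell
# Build an AMQP URI that addresses the Service by its in-cluster DNS name.
# $1: username, $2: password, $3: Service name, $4: namespace
# Assumption: username and password need no percent-encoding.
amqp_uri() {
  printf 'amqp://%s:%s@%s.%s.svc:5672\n' "$1" "$2" "$3" "$4"
}

# Usage (needs a live cluster, so it is left commented out):
# kubectl run perf-test --image=pivotalrabbitmq/perf-test -- \
#   --uri "$(amqp_uri "$username" "$password" my-rabbit-cluster rabbitmq-system)"
```

Because the DNS name includes the namespace, the perf-test pod can run in a different namespace from the RabbitMQ cluster and still resolve it.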

# kubectl logs --follow perf-test
id: test-041921-512, time: 1.000s, sent: 4720 msg/s, received: 2109 msg/s, min/median/75th/95th/99th consumer latency: 3075/22253/33855/75681/89195 µs
id: test-041921-512, time: 2.000s, sent: 16538 msg/s, received: 13698 msg/s, min/median/75th/95th/99th consumer latency: 91432/157833/203565/344737/356844 µs
id: test-041921-512, time: 3.001s, sent: 18142 msg/s, received: 17832 msg/s, min/median/75th/95th/99th consumer latency: 268145/343634/378988/437575/450074 µs
id: test-041921-512, time: 4.001s, sent: 20991 msg/s, received: 17456 msg/s, min/median/75th/95th/99th consumer latency: 264207/345566/373946/407531/425332 µs
id: test-041921-512, time: 5.015s, sent: 24398 msg/s, received: 13231 msg/s, min/median/75th/95th/99th consumer latency: 358486/701769/755296/803202/832797 µs
id: test-041921-512, time: 6.017s, sent: 14616 msg/s, received: 12665 msg/s, min/median/75th/95th/99th consumer latency: 834423/1083012/1225606/1335011/1374533 µs
id: test-041921-512, time: 7.019s, sent: 16229 msg/s, received: 14603 msg/s, min/median/75th/95th/99th consumer latency: 1335783/1502060/1614123/1700792/1742414 µs
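As the output shows, after the first second perf-test publishes roughly 15,000 to 24,000 messages per second and consumes roughly 13,000 to 18,000.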
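
To summarize the throughput instead of reading it off line by line, the log can be averaged with awk. This is a minimal sketch that assumes the line layout shown above (a "sent:" and a "received:" label, each followed by a number); the function name avg_rates is hypothetical.

```shell
# Average the "sent" and "received" rates from perf-test log lines on stdin.
# Assumes each sample line contains "sent: <n> msg/s, received: <n> msg/s".
avg_rates() {
  awk '
    /sent:/ && /received:/ {
      for (i = 1; i < NF; i++) {
        if ($i == "sent:")     { sent += $(i + 1) }
        if ($i == "received:") { recv += $(i + 1) }
      }
      n++
    }
    END {
      if (n > 0) {
        printf "samples=%d avg_sent=%.0f avg_received=%.0f\n", n, sent / n, recv / n
      }
    }
  '
}

# Usage (needs a live cluster, so it is left commented out):
# kubectl logs perf-test | avg_rates
```

Run on the seven sample lines above, this gives an average of about 16,500 msg/s sent and about 13,100 msg/s received.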


Monitoring the RabbitMQ cluster with Prometheus & Grafana

Official documentation: https://www.rabbitmq.com/prometheus.html

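Problems

1. The RabbitMQ cluster above has only one pod, my-rabbit-cluster-server-0.
Before scaling to 2 or more replicas, check that the Kubernetes nodes have enough free resources.
In my case, scaling to 2 replicas failed with: 0/3 nodes are available: 3 Insufficient cpu.

The cause is insufficient CPU on the Kubernetes nodes.
The cluster was created with the default configuration, whose resource requirements are: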

Limits:
  cpu:     2
  memory:  2Gi
Requests:
  cpu:      1
  memory:   2Gi

The details of the automatically created second pod are as follows:

Name:           my-rabbit-cluster-server-1
Namespace:      rabbitmq-system
Priority:       0
Node:           <none>
Labels:         app.kubernetes.io/component=rabbitmq
                app.kubernetes.io/name=my-rabbit-cluster
                app.kubernetes.io/part-of=rabbitmq
                controller-revision-hash=my-rabbit-cluster-server-b7d484587
                statefulset.kubernetes.io/pod-name=my-rabbit-cluster-server-1
Annotations:    prometheus.io/port: 15692
                prometheus.io/scrape: true
Status:         Pending
IP:             
IPs:            <none>
Controlled By:  StatefulSet/my-rabbit-cluster-server
Init Containers:
  setup-container:
    Image:      rabbitmq:3.8.16-management
    Port:       <none>
    Host Port:  <none>
    Command:
      sh
      -c
      cp /tmp/erlang-cookie-secret/.erlang.cookie /var/lib/rabbitmq/.erlang.cookie && chown 999:999 /var/lib/rabbitmq/.erlang.cookie && chmod 600 /var/lib/rabbitmq/.erlang.cookie ; cp /tmp/rabbitmq-plugins/enabled_plugins /operator/enabled_plugins && chown 999:999 /operator/enabled_plugins ; chown 999:999 /var/lib/rabbitmq/mnesia/ ; echo '[default]' > /var/lib/rabbitmq/.rabbitmqadmin.conf && sed -e 's/default_user/username/' -e 's/default_pass/password/' /tmp/default_user.conf >> /var/lib/rabbitmq/.rabbitmqadmin.conf && chown 999:999 /var/lib/rabbitmq/.rabbitmqadmin.conf && chmod 600 /var/lib/rabbitmq/.rabbitmqadmin.conf
    Limits:
      cpu:     100m
      memory:  500Mi
    Requests:
      cpu:        100m
      memory:     500Mi
    Environment:  <none>
    Mounts:
      /operator from rabbitmq-plugins (rw)
      /tmp/default_user.conf from rabbitmq-confd (rw,path="default_user.conf")
      /tmp/erlang-cookie-secret/ from erlang-cookie-secret (rw)
      /tmp/rabbitmq-plugins/ from plugins-conf (rw)
      /var/lib/rabbitmq/ from rabbitmq-erlang-cookie (rw)
      /var/lib/rabbitmq/mnesia/ from persistence (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from my-rabbit-cluster-server-token-vgvp4 (ro)
Containers:
  rabbitmq:
    Image:       rabbitmq:3.8.16-management
    Ports:       4369/TCP, 5672/TCP, 15672/TCP, 15692/TCP
    Host Ports:  0/TCP, 0/TCP, 0/TCP, 0/TCP
    Limits:
      cpu:     2
      memory:  2Gi
    Requests:
      cpu:      1
      memory:   2Gi
    Readiness:  tcp-socket :amqp delay=10s timeout=5s period=10s #success=1 #failure=3
    Environment:
      MY_POD_NAME:                    my-rabbit-cluster-server-1 (v1:metadata.name)
      MY_POD_NAMESPACE:               default (v1:metadata.namespace)
      RABBITMQ_ENABLED_PLUGINS_FILE:  /operator/enabled_plugins
      K8S_SERVICE_NAME:               my-rabbit-cluster-nodes
      RABBITMQ_USE_LONGNAME:          true
      RABBITMQ_NODENAME:              rabbit@$(MY_POD_NAME).$(K8S_SERVICE_NAME).$(MY_POD_NAMESPACE)
      K8S_HOSTNAME_SUFFIX:            .$(K8S_SERVICE_NAME).$(MY_POD_NAMESPACE)
    Mounts:
      /etc/pod-info/ from pod-info (rw)
      /etc/rabbitmq/conf.d/10-operatorDefaults.conf from rabbitmq-confd (rw,path="operatorDefaults.conf")
      /etc/rabbitmq/conf.d/11-default_user.conf from rabbitmq-confd (rw,path="default_user.conf")
      /etc/rabbitmq/conf.d/90-userDefinedConfiguration.conf from rabbitmq-confd (rw,path="userDefinedConfiguration.conf")
      /operator from rabbitmq-plugins (rw)
      /var/lib/rabbitmq/ from rabbitmq-erlang-cookie (rw)
      /var/lib/rabbitmq/mnesia/ from persistence (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from my-rabbit-cluster-server-token-vgvp4 (ro)
Conditions:
  Type           Status
  PodScheduled   False 
Volumes:
  persistence:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  persistence-my-rabbit-cluster-server-1
    ReadOnly:   false
  plugins-conf:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      my-rabbit-cluster-plugins-conf
    Optional:  false
  rabbitmq-confd:
    Type:                Projected (a volume that contains injected data from multiple sources)
    SecretName:          my-rabbit-cluster-default-user
    SecretOptionalName:  <nil>
    ConfigMapName:       my-rabbit-cluster-server-conf
    ConfigMapOptional:   <nil>
  rabbitmq-erlang-cookie:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  erlang-cookie-secret:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  my-rabbit-cluster-erlang-cookie
    Optional:    false
  rabbitmq-plugins:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  pod-info:
    Type:  DownwardAPI (a volume populated by information about the pod)
    Items:
      metadata.labels['skipPreStopChecks'] -> skipPreStopChecks
  my-rabbit-cluster-server-token-vgvp4:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  my-rabbit-cluster-server-token-vgvp4
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  33s (x2 over 33s)  default-scheduler  0/3 nodes are available: 3 Insufficient cpu.
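
One way around the Insufficient cpu event above, short of adding nodes, is to request less CPU per RabbitMQ pod. The sketch below prints a RabbitmqCluster manifest with an explicit replica count and CPU request, assuming the spec.replicas and spec.resources fields of the rabbitmq.com/v1beta1 API. The function name scaled_cluster_manifest is hypothetical and the values are illustrative, not tuned recommendations.

```shell
# Print a RabbitmqCluster manifest with an explicit replica count and CPU request.
# $1: number of replicas, $2: CPU request per pod (for example 500m)
# Memory and limits repeat the defaults shown in the pod description above.
scaled_cluster_manifest() {
  cat <<EOF
apiVersion: rabbitmq.com/v1beta1
kind: RabbitmqCluster
metadata:
  name: my-rabbit-cluster
  namespace: rabbitmq-system
spec:
  replicas: $1
  resources:
    requests:
      cpu: $2
      memory: 2Gi
    limits:
      cpu: 2
      memory: 2Gi
EOF
}

# Usage (needs a live cluster, so it is left commented out):
# kubectl describe nodes | grep -A 8 'Allocated resources'   # free capacity per node
# scaled_cluster_manifest 3 500m | kubectl apply -f -
```

Changing the resources of an existing cluster alters the pod template, so expect the running pods to be restarted one at a time. A lower CPU request also reduces the CPU the scheduler guarantees to RabbitMQ, so it is a trade-off rather than a free fix.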