搭建Heapster+InfluxDB+Grafana集群性能监控平台
在大规模容器集群中,需要对所有node和全部容器进行性能监控。kubernetes建议使用一套工具来实现集群性能数据的采集、存储和展示:heapster、InfluxDB和Grafana。
heapster:对集群中各个Node上cAdvisor的数据采集汇聚的系统,通过访问每个node上kubelet的API,再通过kubelet调用cAdvisor的API来采集该节点上所有容器的性能数据。Heapster对性能书库进行聚合,并将结果保存到后端存储系统中。
InfluxDB:是分布式时序数据库(每条记录都带有时间戳属性),主要用于实时数据采集、事件跟踪记录、存储时间图表、原始数据等。InfluxDB提供了REST API用于数据的存储和查询。
Grafana:通过dashboard将InfluxDB中的时序数据展现成图表或曲线等形式,便于运维人员查看集群的运行状态。
基于Heapster+InfluxDB+Grafana集群监控系统总体框架如图所示
Heapster+InfluxDB+Grafana均以Pod的形式启动和运行。
1. 上传相关的镜像到私有仓库
[root@kub_master ~]# docker load -i docker_heapster.tar.gz
c12ecfd4861d: Loading layer [==================================================>] 130.9 MB/130.9 MB
5f70bf18a086: Loading layer [==================================================>] 1.024 kB/1.024 kB
998608e2fcd4: Loading layer [==================================================>] 45.16 MB/45.16 MB
591569fa6c34: Loading layer [==================================================>] 126.5 MB/126.5 MB
0b2fe2c6ef6b: Loading layer [==================================================>] 136.2 MB/136.2 MB
f9f3fb66a490: Loading layer [==================================================>] 322.9 MB/322.9 MB
6e2e798f8998: Loading layer [==================================================>] 2.56 kB/2.56 kB
21ac53bc7cd6: Loading layer [==================================================>] 5.632 kB/5.632 kB
7f96c89af577: Loading layer [==================================================>] 79.98 MB/79.98 MB
4371d588893a: Loading layer [==================================================>] 150.2 MB/150.2 MB
Loaded image: docker.io/kubernetes/heapster:canary
[root@kub_master ~]# docker load -i docker_heapster_influxdb.tar.gz
8ceab61e5aa8: Loading layer [==================================================>] 197.2 MB/197.2 MB
3c84ae1bbde2: Loading layer [==================================================>] 208.9 kB/208.9 kB
e8061ac24ae3: Loading layer [==================================================>] 4.608 kB/4.608 kB
5f70bf18a086: Loading layer [==================================================>] 1.024 kB/1.024 kB
58484cf9c5e7: Loading layer [==================================================>] 63.49 MB/63.49 MB
07d2297acddc: Loading layer [==================================================>] 4.608 kB/4.608 kB
Loaded image: docker.io/kubernetes/heapster_influxdb:v0.5
[root@kub_master ~]# docker images |grep heapster_influxdb
docker.io/kubernetes/heapster_influxdb v0.5 a47993810aac 5 years ago 251 MB
[root@kub_master ~]# docker load -i docker_heapster_grafana.tar.gz
c69ae1aa4698: Loading layer [==================================================>] 131 MB/131 MB
5f70bf18a086: Loading layer [==================================================>] 1.024 kB/1.024 kB
75a5b97e491c: Loading layer [==================================================>] 127.1 MB/127.1 MB
e188e2340071: Loading layer [==================================================>] 16.92 MB/16.92 MB
5e43af080be6: Loading layer [==================================================>] 55.81 kB/55.81 kB
7ecd917a174c: Loading layer [==================================================>] 4.096 kB/4.096 kB
Loaded image: docker.io/kubernetes/heapster_grafana:v2.6.0
[root@kub_master ~]# docker images |grep heapster_grafana
docker.io/kubernetes/heapster_grafana v2.6.0 4fe73eb13e50 4 years ago 267 MB
#上传以上三款镜像至私有仓库
[root@kub_master ~]# docker tag docker.io/kubernetes/heapster:canary 192.168.0.212:5000/heapster:canary
[root@kub_master ~]# docker push 192.168.0.212:5000/heapster:canary
The push refers to a repository [192.168.0.212:5000/heapster]
5f70bf18a086: Mounted from kubernetes-dashboard-amd64
4371d588893a: Pushed
7f96c89af577: Pushed
21ac53bc7cd6: Pushed
6e2e798f8998: Pushed
f9f3fb66a490: Pushed
0b2fe2c6ef6b: Pushed
591569fa6c34: Pushed
998608e2fcd4: Pushed
c12ecfd4861d: Pushed
canary: digest: sha256:a024ec78a53b4b9b0bd13b12a6aec29913359e17d30c8d0c35d0a510d10a9d75 size: 4276
[root@kub_master ~]# docker tag docker.io/kubernetes/heapster_grafana:v2.6.0 192.168.0.212:5000/heapster_grafana:v2.6.0
[root@kub_master ~]# docker push 192.168.0.212:5000/heapster_grafana:v2.6.0
The push refers to a repository [192.168.0.212:5000/heapster_grafana]
5f70bf18a086: Mounted from heapster
7ecd917a174c: Pushed
5e43af080be6: Pushed
e188e2340071: Pushed
75a5b97e491c: Pushed
c69ae1aa4698: Pushed
v2.6.0: digest: sha256:1299d1ebb518416f90895a29fc58ad87149af610c8d5d9376c3379d65b6d9568 size: 2811
[root@kub_master ~]# docker tag docker.io/kubernetes/heapster_influxdb:v0.5 192.168.0.212:5000/heapster_influxdb:v0.5
[root@kub_master ~]# docker push 192.168.0.212:5000/heapster_influxdb:v0.5
The push refers to a repository [192.168.0.212:5000/heapster_influxdb]
5f70bf18a086: Mounted from heapster_grafana
07d2297acddc: Pushed
58484cf9c5e7: Pushed
e8061ac24ae3: Pushed
3c84ae1bbde2: Pushed
8ceab61e5aa8: Pushed
v0.5: digest: sha256:32cdba763a3fab92deeb8074b47ace45ee4a15d537e318ceae89f1e2dd069b53 size: 3012
2. 下载heapster资料包
[root@kub_master ~]# cd k8s/
[root@kub_master k8s]# mkdir heapster
[root@kub_master k8s]# cd heapster/
[root@kub_master heapster]# wget https://www.qstack.com.cn/heapster-influxdb.zip
--2020-09-28 20:29:42-- https://www.qstack.com.cn/heapster-influxdb.zip
Resolving www.qstack.com.cn (www.qstack.com.cn)... 180.96.32.89
Connecting to www.qstack.com.cn (www.qstack.com.cn)|180.96.32.89|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2636 (2.6K) [application/zip]
Saving to: ‘heapster-influxdb.zip’
100%[===============================================================================================>] 2,636 --.-K/s in 0s
2020-09-28 20:29:43 (40.6 MB/s) - ‘heapster-influxdb.zip’ saved [2636/2636]
[root@kub_master heapster]# unzip heapster-influxdb.zip
Archive: heapster-influxdb.zip
creating: heapster-influxdb/
inflating: heapster-influxdb/grafana-service.yaml
inflating: heapster-influxdb/heapster-controller.yaml
inflating: heapster-influxdb/heapster-service.yaml
inflating: heapster-influxdb/influxdb-grafana-controller.yaml
inflating: heapster-influxdb/influxdb-service.yaml
[root@kub_master heapster]# cd heapster-influxdb
[root@kub_master heapster-influxdb]# ll
total 20
-rw-r--r-- 1 root root 414 Sep 14 2016 grafana-service.yaml
-rw-r--r-- 1 root root 682 Jul 1 2019 heapster-controller.yaml
-rw-r--r-- 1 root root 249 Sep 14 2016 heapster-service.yaml
-rw-r--r-- 1 root root 1605 Jul 1 2019 influxdb-grafana-controller.yaml
-rw-r--r-- 1 root root 259 Sep 14 2016 influxdb-service.yaml
3. 部署Heapster容器
[root@kub_master heapster-influxdb]# vim heapster-controller.yaml
[root@kub_master heapster-influxdb]# cat heapster-controller.yaml
apiVersion: v1
kind: ReplicationController
metadata:
labels:
k8s-app: heapster
name: heapster
version: v6
name: heapster
namespace: kube-system
spec:
replicas: 1
selector:
k8s-app: heapster
version: v6
template:
metadata:
labels:
k8s-app: heapster
version: v6
spec:
nodeSelector:
kubernetes.io/hostname: 192.168.0.212
containers:
- name: heapster
image: 192.168.0.212:5000/heapster:canary
imagePullPolicy: IfNotPresent
command:
- /heapster
- --source=kubernetes:http://192.168.0.212:8080?inClusterConfig=false #配置采集来源,为master URL地址
- --sink=influxdb:http://monitoring-influxdb:8086 #配置后端存储系统,使用influxdb数据库
[root@kub_master heapster-influxdb]# vim heapster-service.yaml
[root@kub_master heapster-influxdb]# cat heapster-service.yaml
apiVersion: v1
kind: Service
metadata:
labels:
kubernetes.io/cluster-service: 'true'
kubernetes.io/name: Heapster
name: heapster
namespace: kube-system
spec:
ports:
- port: 80
targetPort: 8082
selector:
k8s-app: heapster
[root@kub_master heapster-influxdb]# kubectl create -f heapster-controller.yaml
replicationcontroller "heapster" created
[root@kub_master heapster-influxdb]# kubectl create -f heapster-service.yaml
service "heapster" created
[root@kub_master heapster-influxdb]# kubectl get all --namespace=kube-system
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
deploy/kube-dns 1 1 1 1 2d
deploy/kubernetes-dashboard-latest 1 1 1 1 1d
NAME DESIRED CURRENT READY AGE
rc/heapster 1 1 1 26s
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
svc/heapster 192.168.211.92 <none> 80/TCP 11s
svc/kube-dns 192.168.230.254 <none> 53/UDP,53/TCP 2d
svc/kubernetes-dashboard 192.168.108.11 <none> 80/TCP 1d
NAME DESIRED CURRENT READY AGE
rs/kube-dns-4072910292 1 1 1 2d
rs/kubernetes-dashboard-latest-3255858758 1 1 1 1d
NAME READY STATUS RESTARTS AGE
po/heapster-jhr57 1/1 Running 0 26s
po/kube-dns-4072910292-4qb6c 4/4 Running 0 2d
po/kubernetes-dashboard-latest-3255858758-fqsbc 1/1 Running 0 1d
4. 部署InfluxDB和Grafana
#InfluxDB和Grafana的RC定义:
[root@kub_master heapster-influxdb]# vim influxdb-grafana-controller.yaml
[root@kub_master heapster-influxdb]# cat influxdb-grafana-controller.yaml
apiVersion: v1
kind: ReplicationController
metadata:
labels:
name: influxGrafana
name: influxdb-grafana
namespace: kube-system
spec:
replicas: 1
selector:
name: influxGrafana
template:
metadata:
labels:
name: influxGrafana
spec:
containers:
- name: influxdb
image: 192.168.0.212:5000/heapster_influxdb:v0.5
imagePullPolicy: IfNotPresent
volumeMounts:
- mountPath: /data
name: influxdb-storage
- name: grafana
imagePullPolicy: IfNotPresent
image: 192.168.0.212:5000/heapster_grafana:v2.6.0
env:
- name: INFLUXDB_SERVICE_URL
value: http://monitoring-influxdb:8086
# The following env variables are required to make Grafana accessible via
# the kubernetes api-server proxy. On production clusters, we recommend
# removing these env variables, setup auth for grafana, and expose the grafana
# service using a LoadBalancer or a public IP.
- name: GF_AUTH_BASIC_ENABLED
value: "false"
- name: GF_AUTH_ANONYMOUS_ENABLED
value: "true"
- name: GF_AUTH_ANONYMOUS_ORG_ROLE
value: Admin
- name: GF_SERVER_ROOT_URL
value: /api/v1/proxy/namespaces/kube-system/services/monitoring-grafana/
volumeMounts:
- mountPath: /var
name: grafana-storage
nodeSelector:
kubernetes.io/hostname: 192.168.0.212
volumes:
- name: influxdb-storage
emptyDir: {}
- name: grafana-storage
emptyDir: {}
#InfluxDB Service 定义
[root@kub_master heapster-influxdb]# vim influxdb-service.yaml
[root@kub_master heapster-influxdb]# cat influxdb-service.yaml
apiVersion: v1
kind: Service
metadata:
labels: null
name: monitoring-influxdb
namespace: kube-system
spec:
ports:
- name: http
port: 8083
targetPort: 8083
- name: api
port: 8086
targetPort: 8086
selector:
name: influxGrafana
#Grafana Service定义
[root@kub_master heapster-influxdb]# vim grafana-service.yaml
[root@kub_master heapster-influxdb]# cat grafana-service.yaml
apiVersion: v1
kind: Service
metadata:
labels:
kubernetes.io/cluster-service: 'true'
kubernetes.io/name: monitoring-grafana
name: monitoring-grafana
namespace: kube-system
spec:
# In a production setup, we recommend accessing Grafana through an external Loadbalancer
# or through a public IP.
# type: LoadBalancer
ports:
- port: 80
targetPort: 3000
selector:
name: influxGrafana
[root@kub_master heapster-influxdb]# kubectl create -f influxdb-grafana-controller.yaml
replicationcontroller "influxdb-grafana" created
[root@kub_master heapster-influxdb]# kubectl get all --namespace=kube-system
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
deploy/kube-dns 1 1 1 1 2d
deploy/kubernetes-dashboard-latest 1 1 1 1 1d
NAME DESIRED CURRENT READY AGE
rc/heapster 1 1 1 4m
rc/influxdb-grafana 1 1 1 5s
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
svc/heapster 192.168.211.92 <none> 80/TCP 4m
svc/kube-dns 192.168.230.254 <none> 53/UDP,53/TCP 2d
svc/kubernetes-dashboard 192.168.108.11 <none> 80/TCP 1d
NAME DESIRED CURRENT READY AGE
rs/kube-dns-4072910292 1 1 1 2d
rs/kubernetes-dashboard-latest-3255858758 1 1 1 1d
NAME READY STATUS RESTARTS AGE
po/heapster-jhr57 1/1 Running 0 4m
po/influxdb-grafana-thgp6 2/2 Running 0 5s
po/kube-dns-4072910292-4qb6c 4/4 Running 0 2d
po/kubernetes-dashboard-latest-3255858758-fqsbc 1/1 Running 0 1d
[root@kub_master heapster-influxdb]# kubectl create -f influxdb-service.yaml
service "monitoring-influxdb" created
[root@kub_master heapster-influxdb]# kubectl create -f grafana-service.yaml
service "monitoring-grafana" created
[root@kub_master heapster-influxdb]# kubectl get all --namespace=kube-system
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
deploy/kube-dns 1 1 1 1 2d
deploy/kubernetes-dashboard-latest 1 1 1 1 1d
NAME DESIRED CURRENT READY AGE
rc/heapster 1 1 1 5m
rc/influxdb-grafana 1 1 1 44s
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
svc/heapster 192.168.211.92 <none> 80/TCP 5m
svc/kube-dns 192.168.230.254 <none> 53/UDP,53/TCP 2d
svc/kubernetes-dashboard 192.168.108.11 <none> 80/TCP 1d
svc/monitoring-grafana 192.168.15.46 <none> 80/TCP 3s
svc/monitoring-influxdb 192.168.188.130 <none> 8083/TCP,8086/TCP 10s
NAME DESIRED CURRENT READY AGE
rs/kube-dns-4072910292 1 1 1 2d
rs/kubernetes-dashboard-latest-3255858758 1 1 1 1d
NAME READY STATUS RESTARTS AGE
po/heapster-jhr57 1/1 Running 0 5m
po/influxdb-grafana-thgp6 2/2 Running 0 44s
po/kube-dns-4072910292-4qb6c 4/4 Running 0 2d
po/kubernetes-dashboard-latest-3255858758-fqsbc 1/1 Running 0 1d
5. 访问dashboard
6. 通过cAdvisor页面查看容器的运行状态
在kubernetes系统中,cAdvisor已被默认集成到了kubelet组件内,当kubelet服务启动时,它会自动启动cAdvisor服务,然后cAdvisor会实时采集所在节点的性能指标及在节点上运行的容器的性能指标。kubelet的启动参数--cadvisor-port可自定义cAdvisor对外提供服务的端口号,默认为4194.。
cAdvisor提供了web页面可供浏览器访问。如监控node2节点的性能指标
[root@kub_node2 ~]# vim /etc/kubernetes/kubelet
[root@kub_node2 ~]# cat /etc/kubernetes/kubelet
###
# kubernetes kubelet (minion) config
# The address for the info server to serve on (set to 0.0.0.0 or "" for all interfaces)
KUBELET_ADDRESS="--address=0.0.0.0"
# The port for the info server to serve on
KUBELET_PORT="--port=10250"
# You may leave this blank to use the actual hostname
KUBELET_HOSTNAME="--hostname-override=192.168.0.208"
# location of the api-server
KUBELET_API_SERVER="--api-servers=http://192.168.0.212:8080"
# pod infrastructure container
KUBELET_POD_INFRA_CONTAINER="--pod-infra-container-image=192.168.0.212:5000/pod-infrastructure:latest"
# Add your own!
KUBELET_ARGS="--cluster_dns=192.168.230.254 --cluster_domain=cluster.local --cadvisor-port=4194"
[root@kub_node2 ~]# systemctl restart kubelet
在浏览器中输入http://192.168.0.212:4194来访问cAdvisor的监控页面。
cAdvisor的主页显示了主机的实时运行状态,包括cpu使用情况、内存使用情况、网络吞吐量及文件系统使用情况等信息。
容器的性能数据对于集群监控非常有用,系统管理员可以根据cAdvisor提供的数据进行分析和警告。但是cAdvisor是在每台Node上运行的,只能采集本机的性能指标数据。