Author: 李毓
Kube-prometheus is a complete Kubernetes monitoring solution. It uses Prometheus and its companion tools (Alertmanager, Grafana, and various exporters) to provide monitoring and alerting for Kubernetes clusters. Kube-prometheus is an official project maintained by the Prometheus community and delivers an out-of-the-box Kubernetes monitoring stack.
Compared with deploying Prometheus and the Prometheus Operator directly through Helm Charts, kube-prometheus has the following advantages:
- A complete monitoring stack. kube-prometheus ships a full monitoring solution, including:
  - Prometheus: collects and stores time-series data.
  - Alertmanager: processes and routes alerts.
  - Grafana: data visualization.
  - Node Exporter: host-level metrics.
  - kube-state-metrics: state metrics for Kubernetes resources.
  - Prometheus Adapter: autoscaling based on custom metrics.
  This integration lets you deploy a complete monitoring stack quickly, without configuring and wiring each component individually.
- Official support. kube-prometheus is maintained by the Prometheus community itself, which guarantees the project's stability and updates. Helm Charts, by contrast, may be maintained by different communities or individuals, so quality and update cadence can vary.
- Easy to customize and extend. kube-prometheus uses jsonnet as its configuration language, a powerful and flexible way to customize and extend the stack. You can modify the existing configuration and add custom monitoring and alerting rules without configuring every component from scratch (see the sketch after this list).
- A standardized deployment workflow. kube-prometheus provides a standardized process covering installation, configuration, and integration of all components, which reduces the effort and complexity of deployment, especially when you need a monitoring environment up quickly.
- Pre-configured monitoring and alerting rules. kube-prometheus ships with rules based on community best practices, helping you monitor and alert on a Kubernetes cluster faster. The rules cover common Kubernetes resources and components such as nodes, Pods, and Services.
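If you do want the jsonnet customization route mentioned above, the upstream repository ships a build script that renders manifests from a jsonnet entry point. A minimal sketch, assuming the file names used in the kube-prometheus repository's customization docs (jsonnetfile.json, example.jsonnet, build.sh; details differ between releases):

# inside a checkout of the kube-prometheus repository
jb install                    # fetch the jsonnet dependencies declared in jsonnetfile.json
./build.sh example.jsonnet    # render the customized stack into the manifests/ directory

The rest of this article uses the pre-rendered manifests shipped with the release, so this step is optional.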
Now let's deploy a complete kube-prometheus stack. The latest release at the time of writing is 0.13, but kube-prometheus is tightly coupled to the Kubernetes version, and my cluster is still on Kubernetes 1.20, so I do not use 0.13 for this walkthrough. Refer to the compatibility matrix below.
kube-prometheus stack | Kubernetes 1.16 | Kubernetes 1.17 | Kubernetes 1.18 | Kubernetes 1.19 | Kubernetes 1.20 |
---|---|---|---|---|---|
release-0.4 | ✔ (v1.16.5+) | ✔ | ✗ | ✗ | ✗ |
release-0.5 | ✗ | ✗ | ✔ | ✗ | ✗ |
release-0.6 | ✗ | ✗ | ✔ | ✔ | ✗ |
release-0.7 | ✗ | ✗ | ✗ | ✔ | ✔ |
HEAD | ✗ | ✗ | ✗ | ✔ | ✔ |
So, download release 0.7 of kube-prometheus from the project's GitHub releases:
wget -O kube-prometheus-0.7.0.tar.gz https://github.com/prometheus-operator/kube-prometheus/archive/refs/tags/v0.7.0.tar.gz
tar -zxvf kube-prometheus-0.7.0.tar.gz
cd kube-prometheus-0.7.0/manifests
First, a small shell script to rewrite all images to a domestic mirror so they are easier to pull:
#!/bin/bash
# Iterate over every .yaml file in the current directory
for file in *.yaml; do
    # Rewrite quay.io image references to the USTC mirror
    sed -i 's/quay.io/quay.mirrors.ustc.edu.cn/g' "$file"
done
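An optional sanity check (just a grep, not part of the upstream project) to confirm nothing still points at quay.io:

grep -rn "image: quay.io" *.yaml || echo "all quay.io image references rewritten"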
Create directories to sort the manifests:
mkdir -pv serviceMonitor prometheus adapter node-exporter kube-state-metrics grafana alertmanager operator other storage
mv *-serviceMonitor* serviceMonitor/
mv grafana-* grafana/
mv kube-state-metrics-* kube-state-metrics/
mv alertmanager-* alertmanager/
mv node-exporter-* node-exporter/
mv prometheus-adapter* adapter/
mv prometheus-* prometheus/
The final layout looks like this:
root@apulis:~/kube-prometheus-0.7.0/manifests# tree
.
├── adapter
│ ├── prometheus-adapter-apiService.yaml
│ ├── prometheus-adapter-clusterRoleAggregatedMetricsReader.yaml
│ ├── prometheus-adapter-clusterRoleBindingDelegator.yaml
│ ├── prometheus-adapter-clusterRoleBinding.yaml
│ ├── prometheus-adapter-clusterRoleServerResources.yaml
│ ├── prometheus-adapter-clusterRole.yaml
│ ├── prometheus-adapter-configMap.yaml
│ ├── prometheus-adapter-deployment.yaml
│ ├── prometheus-adapter-roleBindingAuthReader.yaml
│ ├── prometheus-adapter-serviceAccount.yaml
│ ├── prometheus-adapter-serviceMonitor.yaml
│ └── prometheus-adapter-service.yaml
├── alertmanager
│ ├── alertmanager-alertmanager.yaml
│ ├── alertmanager-secret.yaml
│ ├── alertmanager-serviceAccount.yaml
│ ├── alertmanager-serviceMonitor.yaml
│ └── alertmanager-service.yaml
├── grafana
│ ├── grafana-dashboardDatasources.yaml
│ ├── grafana-dashboardDefinitions.yaml
│ ├── grafana-dashboardSources.yaml
│ ├── grafana-deployment.yaml
│ ├── grafana-pvc.yaml
│ ├── grafana-serviceAccount.yaml
│ ├── grafana-serviceMonitor.yaml
│ └── grafana-service.yaml
├── kube-state-metrics
│ ├── kube-state-metrics-clusterRoleBinding.yaml
│ ├── kube-state-metrics-clusterRole.yaml
│ ├── kube-state-metrics-deployment.yaml
│ ├── kube-state-metrics-serviceAccount.yaml
│ ├── kube-state-metrics-serviceMonitor.yaml
│ └── kube-state-metrics-service.yaml
├── node-exporter
│ ├── node-exporter-clusterRoleBinding.yaml
│ ├── node-exporter-clusterRole.yaml
│ ├── node-exporter-daemonset.yaml
│ ├── node-exporter-serviceAccount.yaml
│ ├── node-exporter-serviceMonitor.yaml
│ └── node-exporter-service.yaml
├── operator
├── other
├── prometheus
│ ├── prometheus-clusterRoleBinding.yaml
│ ├── prometheus-clusterRole.yaml
│ ├── prometheus-prometheus.yaml
│ ├── prometheus-roleBindingConfig.yaml
│ ├── prometheus-roleBindingSpecificNamespaces.yaml
│ ├── prometheus-roleConfig.yaml
│ ├── prometheus-roleSpecificNamespaces.yaml
│ ├── prometheus-rules.yaml
│ ├── prometheus-serviceAccount.yaml
│ └── prometheus-service.yaml
├── serviceMonitor
│ ├── prometheus-operator-serviceMonitor.yaml
│ ├── prometheus-serviceMonitorApiserver.yaml
│ ├── prometheus-serviceMonitorCoreDNS.yaml
│ ├── prometheus-serviceMonitorKubeControllerManager.yaml
│ ├── prometheus-serviceMonitorKubelet.yaml
│ ├── prometheus-serviceMonitorKubeScheduler.yaml
│ └── prometheus-serviceMonitor.yaml
├── setup
│ ├── 0namespace-namespace.yaml
│ ├── prometheus-operator-0alertmanagerConfigCustomResourceDefinition.yaml
│ ├── prometheus-operator-0alertmanagerCustomResourceDefinition.yaml
│ ├── prometheus-operator-0podmonitorCustomResourceDefinition.yaml
│ ├── prometheus-operator-0probeCustomResourceDefinition.yaml
│ ├── prometheus-operator-0prometheusCustomResourceDefinition.yaml
│ ├── prometheus-operator-0prometheusruleCustomResourceDefinition.yaml
│ ├── prometheus-operator-0servicemonitorCustomResourceDefinition.yaml
│ ├── prometheus-operator-0thanosrulerCustomResourceDefinition.yaml
│ ├── prometheus-operator-clusterRoleBinding.yaml
│ ├── prometheus-operator-clusterRole.yaml
│ ├── prometheus-operator-deployment.yaml
│ ├── prometheus-operator-serviceAccount.yaml
│ └── prometheus-operator-service.yaml
└── storage
├── prometheus-nfs-clusterrolebinding.yaml
├── prometheus-nfs-deployment.yaml
├── prometheus-nfs-serviceaccount.yaml
└── prometheus-nfs-storageclass.yaml
11 directories, 74 files
kube-prometheus uses emptyDir volumes by default, so data is lost whenever a Pod is deleted. We must therefore switch to persistent storage. I use NFS here, with a StorageClass providing dynamic provisioning for the data.
Deploy NFS
1. Install the NFS server
apt install -y nfs-kernel-server rpcbind
2. Configure the exported directory
echo "/k8s-data *(rw,sync,crossmnt,no_subtree_check,no_root_squash)" > /etc/exports
3. Create the directory
mkdir /k8s-data
4. Start the NFS services
systemctl start rpcbind
systemctl start nfs-kernel-server
5. Enable them at boot
systemctl enable rpcbind
systemctl enable nfs-kernel-server
6. Reload the export configuration
exportfs -arv
7. Create the data directory
mkdir -p /k8s-data/prometheus/prometheus-data
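Before moving on, it is worth verifying the export, for example from a Kubernetes node. A hedged check, assuming 192.168.4.77 is the NFS server used throughout this article and the node has the NFS client tools (nfs-common) installed:

showmount -e 192.168.4.77
mount -t nfs 192.168.4.77:/k8s-data/prometheus/prometheus-data /mnt && umount /mnt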
Create the provisioner for the StorageClass
Create the ServiceAccount:
cat <<END > storage/prometheus-nfs-serviceaccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus-nfs-provisioner
  namespace: monitoring
END
Bind the ServiceAccount to a cluster role for authorization:
cat <<END > storage/prometheus-nfs-clusterrolebinding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus-nfs-provisioner-clusterrolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: prometheus-nfs-provisioner
  namespace: monitoring
END
Deploy the NFS client provisioner:
cat <<END > storage/prometheus-nfs-deployment.yaml
kind: Deployment
apiVersion: apps/v1
metadata:
  name: prometheus-nfs-provisioner
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: prometheus-nfs-provisioner
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: prometheus-nfs-provisioner
    spec:
      serviceAccountName: prometheus-nfs-provisioner   # the ServiceAccount created above
      containers:
      - name: nfs-provisioner
        image: registry.cn-beijing.aliyuncs.com/mydlq/nfs-subdir-external-provisioner:v4.0.0
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: prometheus-nfs-client
          mountPath: /persistentvolumes
        env:
        - name: PROVISIONER_NAME
          value: k8s.prometheus/nfs                    # provisioner name, referenced by the StorageClass
        - name: NFS_SERVER
          value: 192.168.4.77                          # NFS server address
        - name: NFS_PATH
          value: /k8s-data/prometheus/prometheus-data  # NFS export path
      volumes:
      - name: prometheus-nfs-client
        nfs:
          server: 192.168.4.77                         # NFS server address
          path: /k8s-data/prometheus/prometheus-data   # NFS export path
END
Create the StorageClass object:
cat <<END > storage/prometheus-nfs-storageclass.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: prometheus-data-db
provisioner: k8s.prometheus/nfs   # must match PROVISIONER_NAME above
reclaimPolicy: Retain             # keep the data files when a PVC is deleted
END
Now deploy them:
kubectl create ns monitoring
kubectl apply -f storage/
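Optionally, a throwaway PVC confirms that the provisioner really binds volumes dynamically before kube-prometheus depends on it. The name nfs-test below is only for this check and is not part of kube-prometheus:

cat <<END | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-test
  namespace: monitoring
spec:
  storageClassName: prometheus-data-db
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
END
kubectl get pvc -n monitoring nfs-test    # should reach Bound within a few seconds
kubectl delete pvc -n monitoring nfs-test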
Check that the services are healthy:
root@apulis:~/kube-prometheus-0.7.0/manifests# kubectl get sa -n monitoring prometheus-nfs-provisioner
NAME SECRETS AGE
prometheus-nfs-provisioner 1 95m
root@apulis:~/kube-prometheus-0.7.0/manifests# kubectl get clusterrolebindings.rbac.authorization.k8s.io -n monitoring prometheus-nfs-provisioner-clusterrolebinding
NAME ROLE AGE
prometheus-nfs-provisioner-clusterrolebinding ClusterRole/cluster-admin 96m
root@apulis:~/kube-prometheus-0.7.0/manifests# kubectl get pod -n monitoring
NAME READY STATUS RESTARTS AGE
alertmanager-main-0 2/2 Running 0 78m
alertmanager-main-1 2/2 Running 0 78m
alertmanager-main-2 2/2 Running 0 78m
grafana-5d65487c86-vrtzv 1/1 Running 0 40m
kube-state-metrics-58c88f48b7-wnwhh 3/3 Running 0 78m
node-exporter-kksm7 2/2 Running 0 78m
prometheus-adapter-69b8496df6-5mgc9 1/1 Running 0 78m
prometheus-k8s-0 2/2 Running 1 78m
prometheus-k8s-1 2/2 Running 1 78m
prometheus-nfs-provisioner-57b9b9b9b5-mdffk 1/1 Running 0 73m
prometheus-operator-7649c7454f-tq585 2/2 Running 0 78m
root@apulis:~/kube-prometheus-0.7.0/manifests# kubectl get sc -n monitoring
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
prometheus-data-db k8s.prometheus/nfs Retain Immediate false 96m
Modify the kube-prometheus configuration: add the following persistent storage block to the Prometheus manifest.
  storage:   # new persistent storage configuration
    volumeClaimTemplate:
      spec:
        storageClassName: prometheus-data-db
        resources:
          requests:
            storage: 50Gi
The complete file looks like this:
root@apulis:~/kube-prometheus-0.7.0/manifests# cat prometheus/prometheus-prometheus.yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  labels:
    prometheus: k8s
  name: k8s
  namespace: monitoring
spec:
  alerting:
    alertmanagers:
    - name: alertmanager-main
      namespace: monitoring
      port: web
  image: quay.mirrors.ustc.edu.cn/prometheus/prometheus:v2.22.1
  storage:   # new persistent storage configuration
    volumeClaimTemplate:
      spec:
        storageClassName: prometheus-data-db
        resources:
          requests:
            storage: 50Gi
  nodeSelector:
    kubernetes.io/os: linux
  podMonitorNamespaceSelector: {}
  podMonitorSelector: {}
  probeNamespaceSelector: {}
  probeSelector: {}
  replicas: 2
  resources:
    requests:
      memory: 400Mi
  ruleSelector:
    matchLabels:
      prometheus: k8s
      role: alert-rules
  securityContext:
    fsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  serviceAccountName: prometheus-k8s
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector: {}
  version: v2.22.1
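Once this Prometheus resource is applied (the full apply order is shown later), the operator should create one PVC per replica from the volumeClaimTemplate. Assuming the operator's default PVC naming for this CR, a quick check looks like:

kubectl -n monitoring get pvc | grep prometheus-k8s-db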
Next, make Grafana's storage persistent as well:
vim grafana/grafana-deployment.yaml
      serviceAccountName: grafana
      volumes:
      - name: grafana-storage            # new persistence configuration
        persistentVolumeClaim:
          claimName: grafana             # the PVC created below
      # - emptyDir: {}
      #   name: grafana-storage
      - name: grafana-datasources
        secret:
          secretName: grafana-datasources
We also need to add a PVC for Grafana:
root@apulis:~/kube-prometheus-0.7.0/manifests# cat grafana/grafana-pvc.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: grafana
  namespace: monitoring                   # the monitoring namespace
spec:
  storageClassName: prometheus-data-db    # the StorageClass created earlier
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
Then switch these three Services to NodePort so they are easy to reach:
root@apulis:~/kube-prometheus-0.7.0/manifests# grep -rniC 2 "nodeport"
alertmanager/alertmanager-service.yaml-7- namespace: monitoring
alertmanager/alertmanager-service.yaml-8-spec:
alertmanager/alertmanager-service.yaml:9: type: NodePort
alertmanager/alertmanager-service.yaml-10- ports:
alertmanager/alertmanager-service.yaml-11- - name: web
--
prometheus/prometheus-service.yaml-7- namespace: monitoring
prometheus/prometheus-service.yaml-8-spec:
prometheus/prometheus-service.yaml:9: type: NodePort
prometheus/prometheus-service.yaml-10- ports:
prometheus/prometheus-service.yaml-11- - name: web
--
grafana/grafana-service.yaml-7- namespace: monitoring
grafana/grafana-service.yaml-8-spec:
grafana/grafana-service.yaml:9: type: NodePort
grafana/grafana-service.yaml-10- ports:
grafana/grafana-service.yaml-11- - name: http
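Kubernetes assigns a random port from the cluster's NodePort range (30000-32767 by default, although judging by the ports in the later output this cluster's range has been widened). If you want a stable port instead, you can set nodePort explicitly on the Service; a hedged example for prometheus/prometheus-service.yaml, where 30090 is an arbitrary value and not something from the original manifests:

spec:
  type: NodePort
  ports:
  - name: web
    port: 9090
    targetPort: web
    nodePort: 30090    # any free port in the cluster's NodePort range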
To keep this demo simple I reuse the Prometheus StorageClass for everything; create separate StorageClasses as your own situation requires. Finally, apply everything in this order:
root@apulis:~/kube-prometheus-0.7.0/manifests# cat apply.sh
#!/bin/bash
kubectl apply -f setup/
kubectl apply -f storage/
kubectl apply -f operator/
kubectl apply -f adapter/
kubectl apply -f alertmanager/
kubectl apply -f grafana/
kubectl apply -f kube-state-metrics/
kubectl apply -f node-exporter/
kubectl apply -f prometheus
kubectl apply -f serviceMonitor
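Optionally, wait for the core workloads to finish rolling out before checking (plain kubectl, nothing specific to kube-prometheus):

kubectl -n monitoring rollout status deployment/prometheus-operator
kubectl -n monitoring rollout status deployment/grafana
kubectl -n monitoring rollout status statefulset/prometheus-k8s
kubectl -n monitoring rollout status statefulset/alertmanager-main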
Check the status:
root@apulis:~/kube-prometheus-0.7.0/manifests# kubectl get pod,svc -n monitoring
NAME READY STATUS RESTARTS AGE
pod/alertmanager-main-0 2/2 Running 0 100m
pod/alertmanager-main-1 2/2 Running 0 100m
pod/alertmanager-main-2 2/2 Running 0 100m
pod/grafana-5d65487c86-vrtzv 1/1 Running 0 62m
pod/kube-state-metrics-58c88f48b7-wnwhh 3/3 Running 0 100m
pod/node-exporter-kksm7 2/2 Running 0 100m
pod/prometheus-adapter-69b8496df6-5mgc9 1/1 Running 0 100m
pod/prometheus-k8s-0 2/2 Running 1 100m
pod/prometheus-k8s-1 2/2 Running 1 100m
pod/prometheus-nfs-provisioner-57b9b9b9b5-mdffk 1/1 Running 0 96m
pod/prometheus-operator-7649c7454f-tq585 2/2 Running 0 100m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/alertmanager-main NodePort 10.169.9.15 <none> 9093:37981/TCP 100m
service/alertmanager-operated ClusterIP None <none> 9093/TCP,9094/TCP,9094/UDP 100m
service/grafana NodePort 10.169.66.222 <none> 3000:20026/TCP 100m
service/kube-state-metrics ClusterIP None <none> 8443/TCP,9443/TCP 100m
service/node-exporter ClusterIP None <none> 9100/TCP 100m
service/prometheus-adapter ClusterIP 10.169.34.158 <none> 443/TCP 100m
service/prometheus-k8s NodePort 10.169.171.92 <none> 9090:21637/TCP 100m
service/prometheus-operated ClusterIP None <none> 9090/TCP 100m
service/prometheus-operator ClusterIP None <none> 8443/TCP 100m
It is worth mentioning that, at this point, the stack by default only monitors data inside the cluster. For many companies, however, there are also scattered machines that need monitoring but have no need to join the Kubernetes cluster. To cover them, we adjust node-exporter. First, deploy node_exporter on each machine that needs monitoring:
https://github.com/prometheus/node_exporter/releases/download/v1.6.0/node_exporter-1.6.0.linux-amd64.tar.gz
Run it as a systemd service:
vim /etc/systemd/system/node_export9110.service
[Unit]
Description=Prometheus node_exporter
Requires=network.target remote-fs.target
After=network.target remote-fs.target
[Service]
Type=simple
User=root
Group=root
ExecStart=/usr/local/node_exporter-1.6.0.linux-amd64/node_exporter --web.listen-address=0.0.0.0:9110
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=on-failure
RestartSec=5s
[Install]
WantedBy=multi-user.target
systemctl daemon-reload
systemctl enable node_export9110.service
systemctl start node_export9110.service
root@apulis:~/ops_scripts-monitor_scripts/monitor# netstat -tnlp | grep 9110
tcp6 0 0 :::9110 :::* LISTEN 4059824/node_export
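A quick check that the exporter answers on the new port (run on the monitored machine itself):

curl -s http://localhost:9110/metrics | head -n 5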
Next we modify three files: node-exporter-serviceMonitor.yaml and node-exporter-service.yaml in the node-exporter directory, plus an Endpoints file. The Endpoints object is not shipped with kube-prometheus, so we export a copy and edit it. The contents of the three files follow. In node-exporter-service.yaml, change https to http and 9100 to 9110; note that if you are monitoring nodes of another cluster running the stock node-exporter, the port stays 9100, whereas here node_exporter was deployed standalone on port 9110.
root@apulis:~/kube-prometheus-0.7.0/manifests/node-exporter-bak# cat node-exporter-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/name: node-exporter
    app.kubernetes.io/version: v1.0.1
  name: node-exporter
  namespace: monitoring
spec:
  clusterIP: None
  ports:
  - name: http
    port: 9110
    targetPort: http
  selector:
    app.kubernetes.io/name: node-exporter
Change https to http in node-exporter-serviceMonitor.yaml as well:
root@apulis:~/kube-prometheus-0.7.0/manifests/node-exporter-bak# cat node-exporter-serviceMonitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app.kubernetes.io/name: node-exporter
    app.kubernetes.io/version: v1.0.1
  name: node-exporter
  namespace: monitoring
spec:
  endpoints:
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    interval: 15s
    port: http
    relabelings:
    - action: replace
      regex: (.*)
      replacement: $1
      sourceLabels:
      - __meta_kubernetes_pod_node_name
      targetLabel: instance
    scheme: http
    tlsConfig:
      insecureSkipVerify: true
  jobLabel: app.kubernetes.io/name
  selector:
    matchLabels:
      app.kubernetes.io/name: node-exporter
Then export a copy of the Endpoints object and edit it:
kubectl get endpoints -n monitoring node-exporter -o yaml > node-exporter-endpoint.yaml
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    app.kubernetes.io/name: node-exporter
    app.kubernetes.io/version: v1.0.1
    service.kubernetes.io/headless: ""
  name: node-exporter
  namespace: monitoring
subsets:
- addresses:
  - ip: 192.168.4.77
    nodeName: 192.168.4.77
  ports:
  - name: http
    port: 9110
    protocol: TCP
- addresses:
  - ip: 192.168.5.88
    nodeName: 192.168.5.88
  ports:
  - name: http
    port: 9110
    protocol: TCP
kubectl apply -f node-exporter-endpoint.yaml
root@apulis:~/kube-prometheus-0.7.0/manifests/node-exporter-bak# kubectl get endpoints -n monitoring
NAME ENDPOINTS AGE
alertmanager-main 10.241.236.219:9093,10.241.236.220:9093,10.241.236.221:9093 26h
alertmanager-operated 10.241.236.219:9094,10.241.236.220:9094,10.241.236.221:9094 + 6 more... 26h
grafana 10.241.236.226:3000 26h
k8s.prometheus-nfs <none> 26h
kube-state-metrics 10.241.236.218:8443,10.241.236.218:9443 26h
node-exporter 192.168.4.77:9110,192.168.5.88:9110 9m24s
prometheus-adapter 10.241.236.216:6443 26h
prometheus-k8s 10.241.236.224:9090,10.241.236.225:9090 26h
prometheus-operated 10.241.236.224:9090,10.241.236.225:9090 26h
prometheus-operator 10.241.236.215:8443 26h
prometheus-webhook-apulis-svc 10.241.236.227:5000,10.241.236.228:5000 20h
After applying it, you can see that 192.168.4.77 and 192.168.5.88 are now monitored.
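To double-check from the Prometheus side, you can query the HTTP API through the NodePort shown earlier (21637 in the service listing above; replace 192.168.4.77 with one of your node addresses if it differs). Both external instances should report a value of 1 for up:

curl -s -G 'http://192.168.4.77:21637/api/v1/query' --data-urlencode 'query=up{job="node-exporter"}'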