现在我们来创建 prometheus 的 Pod 资源:

# prometheus-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: prometheus
namespace: kube-mon
labels:
app: prometheus
spec:
selector:
matchLabels:
app: prometheus
template:
metadata:
labels:
app: prometheus
spec:
serviceAccountName: prometheus
containers:
- image: prom/prometheus:v2.24.1
name: prometheus
args:
- "--config.file=/etc/prometheus/prometheus.yml"
- "--storage.tsdb.path=/prometheus" # 指定tsdb数据路径
- "--storage.tsdb.retention.time=24h"
- "--web.enable-admin-api" # 控制对admin HTTP API的访问,其中包括删除时间序列等功能
- "--web.enable-lifecycle" # 支持热更新,直接执行localhost:9090/-/reload立即生效
ports:
- containerPort: 9090
name: http
volumeMounts:
- mountPath: "/etc/prometheus"
name: config-volume
- mountPath: "/prometheus"
name: data
resources:
requests:
cpu: 100m
memory: 512Mi
limits:
cpu: 100m
memory: 512Mi
volumes:
- name: data
persistentVolumeClaim:
claimName: prometheus-data
- configMap:
name: prometheus-config
name: config-volume

另外为了 prometheus 的性能和数据持久化我们这里是直接将通过一个 LocalPV 来进行数据持久化的,通过 ​​--storage.tsdb.path=/prometheus​​ 指定数据目录,创建如下所示的一个 PVC 资源对象,注意是一个 LocalPV,和 node3 节点具有亲和性:

apiVersion: v1
kind: PersistentVolume
metadata:
name: prometheus-local
labels:
app: prometheus
spec:
accessModes:
- ReadWriteOnce
capacity:
storage: 20Gi
storageClassName: local-storage
local:
path: /data/k8s/prometheus
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- node3
persistentVolumeReclaimPolicy: Retain
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: prometheus-data
namespace: kube-mon
spec:
selector:
matchLabels:
app: prometheus
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 20Gi
storageClassName: local-storage

现在我们就可以添加 promethues 的资源对象了:

$ kubectl apply -f prometheus-deploy.yaml 
deployment.apps/prometheus created
$ kubectl get pods -n kube-mon
NAME READY STATUS RESTARTS AGE
prometheus-df4f47d95-vksmc 0/1 CrashLoopBackOff 3 98s
$ kubectl logs -f prometheus-df4f47d95-vksmc -n kube-mon
level=info ts=2019-12-12T03:08:49.424Z caller=main.go:332 msg="Starting Prometheus" version="(version=2.14.0, branch=HEAD, revision=edeb7a44cbf745f1d8be4ea6f215e79e651bfe19)"
level=info ts=2019-12-12T03:08:49.424Z caller=main.go:333 build_context="(go=go1.13.4, user=root@df2327081015, date=20191111-14:27:12)"
level=info ts=2019-12-12T03:08:49.425Z caller=main.go:334 host_details="(Linux 3.10.0-1062.4.1.el7.x86_64 #1 SMP Fri Oct 18 17:15:30 UTC 2019 x86_64 prometheus-df4f47d95-vksmc (none))"
level=info ts=2019-12-12T03:08:49.425Z caller=main.go:335 fd_limits="(soft=1048576, hard=1048576)"
level=info ts=2019-12-12T03:08:49.425Z caller=main.go:336 vm_limits="(soft=unlimited, hard=unlimited)"
level=error ts=2019-12-12T03:08:49.425Z caller=query_logger.go:85 component=activeQueryTracker msg="Error opening query log file" file=/prometheus/queries.active err="open /prometheus/queries.active: permission denied"
panic: Unable to create mmap-ed active query log

goroutine 1 [running]:
github.com/prometheus/prometheus/promql.NewActiveQueryTracker(0x7ffd8cf6ec5d, 0xb, 0x14, 0x2b4f400, 0xc0006f33b0, 0x2b4f400)
/app/promql/query_logger.go:115 +0x48c
main.main()
/app/cmd/prometheus/main.go:364 +0x5229

创建 Pod 后,我们可以看到并没有成功运行,出现了 ​​open /prometheus/queries.active: permission denied​​​ 这样的错误信息,这是因为我们的 prometheus 的镜像中是使用的 nobody 这个用户,然后现在我们通过 LocalPV 挂载到宿主机上面的目录的 ​​ownership​​​ 却是 ​​root​​: 

$ ls -la /data/k8s
total 36
drwxr-xr-x 6 root root 4096 Dec 12 11:07 .
dr-xr-xr-x. 19 root root 4096 Nov 9 23:19 ..
drwxr-xr-x 2 root root 4096 Dec 12 11:07 prometheus
[root@master persistentvolume]# kubectl exec -it prometheus-server-5775f99578-5gm5f -n monitor  sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
/prometheus $ id
uid=65534(nobody) gid=65534(nogroup)

/prometheus $ ls -la /prometheus/
total 16
drwxr-xr-x 3 nobody nogroup 4096 Jan 17 15:31 .
drwxr-xr-x 1 root root 4096 Jan 17 15:31 ..
-rw------- 1 nobody nogroup 2 Jan 17 15:31 lock
drwxr-xr-x 2 nobody nogroup 4096 Jan 17 15:31 wal

所以当然会出现操作权限问题了,这个时候我们就可以通过 ​​securityContext​​​ 来为 Pod 设置下 volumes 的权限,通过设置 ​​runAsUser=0​​ 指定运行的用户为 root,也可以通过设置一个 initContainer 来修改数据目录权限:

......
initContainers:
- name: fix-permissions
image: busybox
command: [chown, -R, "nobody:nobody", /prometheus]
volumeMounts:
- name: data
mountPath: /prometheus

这个时候我们重新更新下 prometheus:

$ kubectl apply -f prometheus-deploy.yaml
deployment.apps/prometheus configured
$ kubectl get pods -n kube-mon
NAME READY STATUS RESTARTS AGE
prometheus-79b8774f68-7m8zr 1/1 Running 0 56s
$ kubectl logs -f prometheus-79b8774f68-7m8zr -n kube-mon
level=info ts=2019-12-12T03:17:44.228Z caller=main.go:332 msg="Starting Prometheus" version="(version=2.14.0, branch=HEAD, revision=edeb7a44cbf745f1d8be4ea6f215e79e651bfe19)"
......
level=info ts=2019-12-12T03:17:44.822Z caller=main.go:673 msg="TSDB started"
level=info ts=2019-12-12T03:17:44.822Z caller=main.go:743 msg="Loading configuration file" filename=/etc/prometheus/prometheus.yml
level=info ts=2019-12-12T03:17:44.827Z caller=main.go:771 msg="Completed loading of configuration file" filename=/etc/prometheus/prometheus.yml
level=info ts=2019-12-12T03:17:44.827Z caller=main.go:626 msg="Server is ready to receive web requests."