Containerized Deployment of Prometheus and Grafana
Environment
Hostname | IP |
master | 192.168.58.110 |
client | 192.168.58.20 |
Install Docker on the master host
Docker installation
Configure the base package repository (RHEL-family system)
[root@master ~]# curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-8.repo
Configure the docker-ce repository
[root@master ~]# cd /etc/yum.repos.d/
[root@master yum.repos.d]# curl -o docker-ce.repo https://mirrors.tuna.tsinghua.edu.cn/docker-ce/linux/centos/docker-ce.repo
Install docker-ce together with its dependencies and tools
[root@master ~]# dnf -y install yum-utils device-mapper-persistent-data lvm2
[root@master ~]# yum -y install docker-ce --allowerasing
After installation, run docker version to check the Docker version
[root@master ~]# docker version
Client: Docker Engine - Community
Version: 20.10.11
API version: 1.41
Go version: go1.16.9
Git commit: dea9396
Built: Thu Nov 18 00:36:58 2021
OS/Arch: linux/amd64
Context: default
Experimental: true
Configure a Docker registry mirror (accelerator)
To obtain your personal accelerator URL, see "Docker basic usage"
[root@master ~]# mkdir -p /etc/docker
[root@master ~]# vim /etc/docker/daemon.json
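{
  "registry-mirrors": ["https://a74l47xi.mirror.aliyuncs.com"]
}
(The mirror URL above is assigned per account; use your own. JSON does not allow comments, so keep this note out of the file.)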
[root@master ~]# systemctl daemon-reload
[root@master ~]# systemctl restart docker
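Because daemon.json must be strict JSON, it can be safer to generate it than to hand-edit it. The following is a minimal sketch, assuming a hypothetical helper named write_daemon_json (it is not part of Docker); the demo writes to a scratch directory so nothing on the host is touched.

```shell
#!/bin/sh
# Sketch: write a registry-mirror config as strict JSON (no comments inside the file).
# write_daemon_json is a hypothetical helper, not a Docker command.
write_daemon_json() {
    dir=$1      # target directory; /etc/docker on a real host
    mirror=$2   # mirror URL assigned to your account
    mkdir -p "$dir" || return 1
    printf '{\n  "registry-mirrors": ["%s"]\n}\n' "$mirror" > "$dir/daemon.json"
}

# Demo against a scratch directory so the host configuration is not modified.
demo_dir=$(mktemp -d)
write_daemon_json "$demo_dir" "https://a74l47xi.mirror.aliyuncs.com"
cat "$demo_dir/daemon.json"
rm -rf "$demo_dir"
```

On the real host the target directory would be /etc/docker, followed by the systemctl restart docker shown above; docker info should then list the URL under "Registry Mirrors".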
After configuration, pull the official prom/prometheus image
[root@master ~]# docker pull prom/prometheus
Using default tag: latest
latest: Pulling from prom/prometheus
97518928ae5f: Pull complete
5b58818b7f48: Pull complete
d9a64d9fd162: Pull complete
4e368e1b924c: Pull complete
867f7fdd92d9: Pull complete
387c55415012: Pull complete
07f94c8f51cd: Pull complete
ce8cf00ff6aa: Pull complete
e44858b5f948: Pull complete
4000fdbdd2a3: Pull complete
Digest: sha256:18d94ae734accd66bccf22daed7bdb20c6b99aa0f2c687eea3ce4275fe275062
Status: Downloaded newer image for prom/prometheus:latest
docker.io/prom/prometheus:latest
[root@master ~]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
prom/prometheus latest a3d385fc29f9 11 days ago 201MB
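Get the prometheus.yml configuration file on the client
Prometheus official site
# Upload the Prometheus release tarball to the host, extract it, and copy the prometheus.yml configuration file to /opt on the master host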
[root@client ~]# ls
anaconda-ks.cfg prometheus-2.31.1.linux-amd64.tar.gz
[root@client ~]# tar xf prometheus-2.31.1.linux-amd64.tar.gz
[root@client ~]# cd prometheus-2.31.1.linux-amd64
[root@client prometheus-2.31.1.linux-amd64]# scp /root/prometheus-2.31.1.linux-amd64/prometheus.yml 192.168.58.110:/opt/prometheus.yml
root@192.168.58.110's password:
prometheus.yml 100% 934 29.3KB/s 00:00
Run the Prometheus container from the official image, mapping the port and the configuration file
# View the configuration file
[root@master ~]# cat /opt/prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]
# View the images
[root@master ~]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
prom/prometheus latest a3d385fc29f9 11 days ago 201MB
# Map the port and the configuration file from the host, and have the container start automatically whenever Docker starts
[root@master opt]# docker run -d --name prometheus --restart always -p 9090:9090 -v /opt/prometheus.yml:/etc/prometheus/prometheus.yml prom/prometheus
cb748d375af075241ea835c14a00896a8d94a3e05f911f8b88c155be9ae35980
[root@master opt]# docker ps | grep prometheus
cb748d375af0 prom/prometheus "/bin/prometheus --c…" 7 seconds ago Up 7 seconds 0.0.0.0:9090->9090/tcp, :::9090->9090/tcp prometheus
# Check the container's running status
[root@master ~]# docker ps | grep prometheus
933b88601ed6 prom/prometheus "/bin/prometheus --c…" 10 minutes ago Up 10 minutes 0.0.0.0:9090->9090/tcp, :::9090->9090/tcp prometheus
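A syntax error in the mounted prometheus.yml makes the container exit on startup, so it can help to validate the file first. The sketch below only builds and prints a validation command; it assumes the promtool binary at /bin/promtool inside the prom/prometheus image, and check_cmd is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: build a one-off "promtool check config" command for a host-side config file.
# check_cmd is a hypothetical helper; it prints the command instead of running it.
check_cmd() {
    cfg=$1   # config path on the host, e.g. /opt/prometheus.yml
    printf 'docker run --rm --entrypoint /bin/promtool -v %s:/etc/prometheus/prometheus.yml prom/prometheus check config /etc/prometheus/prometheus.yml\n' "$cfg"
}

# Print the command for the config used in this walkthrough.
check_cmd /opt/prometheus.yml
```

Piping the printed line to sh on the Docker host (check_cmd /opt/prometheus.yml | sh) would run the check in a throwaway container.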
Access the web UI (IP + port)
View the targets Prometheus is currently monitoring
How do we monitor other hosts (nodes)?
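Prometheus can collect data from the various components of a Kubernetes cluster, such as the cAdvisor built into the kubelet and the API server; node-exporter is another such source.
Exporter is the general term for a class of Prometheus data-collection components. An exporter gathers data from its target and converts it into a format Prometheus supports. Unlike traditional collection agents, it does not push data to a central server; it waits for the central server to come and scrape it. For node-exporter the default scrape address is http://CURRENT_IP:9100/metrics
node-exporter collects server-level runtime metrics, including basics such as loadavg, filesystem, and meminfo, and plays a role similar to zabbix-agent in traditional host monitoring.
Here node-exporter collects the metrics on each node and Prometheus scrapes them from it, which is how different nodes get monitored.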
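To make the pull model concrete: an exporter simply serves plain-text lines of the form "name value" over HTTP, and whoever scrapes it parses them. The sketch below extracts one value from such text; metric_value is a hypothetical helper and the sample lines are illustrative, not captured from a real host.

```shell
#!/bin/sh
# metric_value NAME: read Prometheus text-format metrics on stdin and print the
# value of the first sample whose metric name is exactly NAME (samples without labels).
metric_value() {
    awk -v name="$1" '$1 == name { print $2; exit }'
}

# Illustrative sample of what an exporter's /metrics endpoint returns.
sample='# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
node_load5 0.30'

printf '%s\n' "$sample" | metric_value node_load1
```

Against a running exporter the input would come from the endpoint itself, for example curl -s http://192.168.58.20:9100/metrics | metric_value node_load1.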
Deploy node-exporter on the client host
Prometheus official site
Upload the tarball to the client host, extract it, and rename the directory
[root@client ~]# ls
anaconda-ks.cfg node_exporter-1.3.0.linux-amd64.tar.gz
[root@client ~]# tar xf node_exporter-1.3.0.linux-amd64.tar.gz -C /usr/local/
[root@client ~]# cd /usr/local/
[root@client local]# ls
bin etc games include lib lib64 libexec node_exporter-1.3.0.linux-amd64 prometheus sbin share src
[root@client local]# mv node_exporter-1.3.0.linux-amd64/ node_exporter
[root@client local]# ls
bin etc games include lib lib64 libexec node_exporter prometheus sbin share src
Configure the service file
[root@client ~]# vim /usr/lib/systemd/system/node_exporter.service
[Unit]
Description=The node_exporter Server
After=network.target
[Service]
ExecStart=/usr/local/node_exporter/node_exporter
Restart=on-failure
RestartSec=15s
SyslogIdentifier=node_exporter
[Install]
WantedBy=multi-user.target
# Enable node_exporter at boot and start it
[root@client local]# systemctl daemon-reload && systemctl enable node_exporter && systemctl restart node_exporter
Created symlink /etc/systemd/system/multi-user.target.wants/node_exporter.service → /usr/lib/systemd/system/node_exporter.service.
Check the port (9100 by default)
[root@client ~]# ss -anlt
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 128 0.0.0.0:22 0.0.0.0:*
LISTEN 0 128 [::]:22 [::]:*
LISTEN 0 128 *:9100 *:*
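The visual check above can also be scripted. This is a small sketch, assuming a hypothetical helper is_listening that reads ss -lnt style output on stdin; the demo feeds it the listing shown above rather than calling ss.

```shell
#!/bin/sh
# is_listening PORT: read `ss -lnt` style output on stdin; succeed if a socket in
# LISTEN state has a local address whose port is exactly PORT.
is_listening() {
    awk -v port="$1" '
        $1 == "LISTEN" {
            n = split($4, parts, ":")        # the port is the last colon-separated field
            if (parts[n] == port) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Demo with the listing from this section.
sample='State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 128 0.0.0.0:22 0.0.0.0:*
LISTEN 0 128 [::]:22 [::]:*
LISTEN 0 128 *:9100 *:*'

if printf '%s\n' "$sample" | is_listening 9100; then
    echo "node_exporter port 9100 is listening"
fi
```

On the client host the live form would be ss -lnt | is_listening 9100.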
On the master host, edit the prometheus.yml configuration file and add the node
[root@master ~]# vi /opt/prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "Linux Server" # added
    static_configs: # added
      - targets: ["192.168.58.20:9100"] # added: IP of the host running node_exporter, plus port 9100
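When more nodes are added later, appending the job by script avoids indentation slips. The following is a rough sketch under one stated assumption: scrape_configs is the last top-level section of the file, as in the stock prometheus.yml. add_scrape_job is a hypothetical helper name.

```shell
#!/bin/sh
# add_scrape_job FILE JOB TARGET: append a static scrape job to FILE unless
# TARGET is already listed. Assumes scrape_configs is the last top-level section.
add_scrape_job() {
    file=$1; job=$2; target=$3
    if grep -qF "\"$target\"" "$file"; then
        return 0    # target already present, nothing to do
    fi
    printf '  - job_name: "%s"\n    static_configs:\n      - targets: ["%s"]\n' \
        "$job" "$target" >> "$file"
}

# Demo on a scratch copy containing only the scrape section.
cfg=$(mktemp)
printf 'scrape_configs:\n  - job_name: "prometheus"\n    static_configs:\n      - targets: ["localhost:9090"]\n' > "$cfg"
add_scrape_job "$cfg" "Linux Server" "192.168.58.20:9100"
add_scrape_job "$cfg" "Linux Server" "192.168.58.20:9100"   # second call is a no-op
cat "$cfg"
rm -f "$cfg"
```

On the master host the file argument would be /opt/prometheus.yml.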
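Restart the container (here the Docker daemon itself is restarted; because the container was started with --restart always it comes back up and reads the new configuration. docker restart prometheus would restart only the container.)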
[root@master ~]# systemctl restart docker
[root@master ~]# docker ps | grep prometheus
cb748d375af0 prom/prometheus "/bin/prometheus --c…" 3 minutes ago Up 3 seconds 0.0.0.0:9090->9090/tcp, :::9090->9090/tcp prometheus
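Prometheus also re-reads its configuration when it receives SIGHUP, so a restart is not strictly required. A minimal sketch, assuming the container name prometheus used above; reload_prometheus is a hypothetical wrapper, and the DOCKER variable exists only so the command can be previewed without a Docker daemon.

```shell
#!/bin/sh
# reload_prometheus [CONTAINER]: send SIGHUP to the container's main process so
# Prometheus reloads its config file. Set DOCKER=echo to preview the command.
reload_prometheus() {
    ${DOCKER:-docker} kill --signal=HUP "${1:-prometheus}"
}

# Preview only: runs in a subshell with DOCKER=echo, so nothing is signalled.
( DOCKER=echo; reload_prometheus prometheus )
```

Calling reload_prometheus without the DOCKER override on the master host would send the signal for real; the Prometheus container log should then report that the configuration file was reloaded.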
Visit the page again; the new node now appears
Visualize the monitored node data with Grafana
Grafana container deployment
Pull the official grafana/grafana image
[root@master ~]# docker pull grafana/grafana
Using default tag: latest
latest: Pulling from grafana/grafana
97518928ae5f: Pull complete
5b58818b7f48: Pull complete
d9a64d9fd162: Pull complete
4e368e1b924c: Pull complete
867f7fdd92d9: Pull complete
387c55415012: Pull complete
07f94c8f51cd: Pull complete
ce8cf00ff6aa: Pull complete
e44858b5f948: Pull complete
4000fdbdd2a3: Pull complete
Digest: sha256:18d94ae734accd66bccf22daed7bdb20c6b99aa0f2c687eea3ce4275fe275062
Status: Downloaded newer image for grafana/grafana:latest
docker.io/grafana/grafana:latest
Run the Grafana container from the image and map the port to expose the service
[root@master ~]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
prom/prometheus latest a3d385fc29f9 11 days ago 201MB
grafana/grafana latest 9b957e098315 2 weeks ago 275MB
[root@master ~]# docker run -dit --name grafan -p 3000:3000 grafana/grafana
2a068867c04d57aa67ece4d35f28e2a77f188c248de6a43bc071a9bb21aae417
[root@master ~]# docker ps | grep grafan
2a068867c04d grafana/grafana "/run.sh" 11 seconds ago Up 8 seconds 0.0.0.0:3000->3000/tcp, :::3000->3000/tcp grafan
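Open the home page (IP + port 3000)
On first login (the default credentials are admin / admin) you are required to change the password
After changing the password you land on the home page
Add the Prometheus data source (i.e. the Prometheus access URL, here http://192.168.58.110:9090)
After filling it in, scroll down and click Save & test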
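The data source can also be declared in a provisioning file instead of being clicked together in the UI. The sketch below generates such a file; write_datasource is a hypothetical helper, the data source name "Prometheus" is an arbitrary choice, and the URL assumes the master host and port used in this walkthrough.

```shell
#!/bin/sh
# write_datasource OUT URL: write a Grafana data source provisioning file to OUT
# that points at the Prometheus server reachable at URL.
write_datasource() {
    out=$1; url=$2
    cat > "$out" <<EOF
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: $url
    isDefault: true
EOF
}

# Demo: write to a scratch file and show the result.
ds=$(mktemp)
write_datasource "$ds" "http://192.168.58.110:9090"
cat "$ds"
rm -f "$ds"
```

To take effect, the generated file would need to be mounted into the container's /etc/grafana/provisioning/datasources/ directory when the Grafana container is started.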
Once it is saved successfully, import a dashboard
Select the data source
For customizing Grafana, see "Custom dashboards"