Learning Prometheus (1): Single-Node Deployment and Configuration, plus Grafana Installation
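- I. Preparation
  - 1. Download the software
  - 2. Host list
  - 3. Synchronize time
  - 4. Create the account
- II. Installation and configuration
  - 1. Install and configure node_exporter
  - 2. Install and configure the server
  - 3. Start the services
    - (1) node_exporter
    - (2) Server
  - 4. Open the web UI to check that it is running
- III. Grafana
  - 1. Install Grafana
  - 2. Configure the Grafana data source
    - (1) Log in to the Grafana console
    - (2) Add a data source
    - (3) Select Prometheus
    - (4) Fill in the settings
  - 3. Configure the dashboard
    - (1) Return to the home page and click Create to create a dashboard
    - (2) Add an empty panel
    - (3) Configure the monitoring query
    - (4) Save the dashboard; here I named it System Monitor
  - 4. Verify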
I. Preparation
1. Download the software
# Official site: https://prometheus.io/
# server (it also runs node_exporter so that it can monitor itself)
https://github.com/prometheus/prometheus/releases/download/v2.25.2/prometheus-2.25.2.linux-amd64.tar.gz
https://github.com/prometheus/node_exporter/releases/download/v1.1.2/node_exporter-1.1.2.linux-amd64.tar.gz
# node_exporter
https://github.com/prometheus/node_exporter/releases/download/v1.1.2/node_exporter-1.1.2.linux-amd64.tar.gz
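The downloads above can be scripted. The following is a minimal sketch, assuming `wget` and `sha256sum` are installed and that each GitHub release publishes a `sha256sums.txt` file next to the tarballs (worth confirming on the release page). The helper names `release_url` and `fetch_release` are made up for this illustration.

```shell
#!/bin/sh
# Build the GitHub release URL for a Prometheus project tarball.
# $1 = project (prometheus | node_exporter), $2 = version without the leading "v".
release_url() {
  printf 'https://github.com/prometheus/%s/releases/download/v%s/%s-%s.linux-amd64.tar.gz\n' \
    "$1" "$2" "$1" "$2"
}

# Download one tarball and verify it against the published checksum list.
# Assumption: the release ships a sha256sums.txt alongside the tarball.
fetch_release() {
  url=$(release_url "$1" "$2")
  wget -q "$url" || return 1
  wget -q -O "sha256sums-$1.txt" "${url%/*}/sha256sums.txt" || return 1
  # Keep only the checksum line for our file and let sha256sum verify it.
  grep -F "$(basename "$url")" "sha256sums-$1.txt" | sha256sum -c -
}

# Usage:
#   fetch_release prometheus 2.25.2
#   fetch_release node_exporter 1.1.2
```

Building the URL in one place keeps the version numbers consistent between the download step and the later `tar` and `mv` commands.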
2. Host list
# Adjust the number of nodes to match your environment
# server
10.0.1.17 (OS and kernel: Ubuntu 18.04.4 LTS (GNU/Linux 4.15.0-118-generic x86_64))
# node_exporter
10.0.1.18 (OS and kernel: Ubuntu 18.04.4 LTS (GNU/Linux 4.15.0-118-generic x86_64))
10.0.6.15 (OS and kernel: CentOS Linux release 7.6.1810 (Core), Linux version 3.10.0-1127.19.1.el7.x86_64)
10.0.6.7 (OS and kernel: CentOS Linux release 7.6.1810 (Core), Linux version 3.10.0-1127.19.1.el7.x86_64)
10.0.6.16 (OS and kernel: CentOS Linux release 7.9.2009 (Core), Linux version 3.10.0-1127.19.1.el7.x86_64)
10.0.6.2 (OS and kernel: CentOS Linux release 8.0.1905 (Core), Linux version 4.18.0-80.el8.x86_64)
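3. Synchronize time
Prometheus is sensitive to clock skew, so the clocks on all instances must stay closely in sync. CentOS 7 is used as the example here; for Ubuntu, look up the equivalent steps for your release.
# Approach: sync the time with ntpdate and set the system time zone to UTC+8
# Check the current state
timedatectl
hwclock --show
# Set the time zone
timedatectl set-timezone Asia/Shanghai
# Install ntpdate
yum -y install ntpdate
# Sync the clock, then write it to the hardware clock
ntpdate cn.pool.ntp.org
hwclock --systohc
# Check again
timedatectl
# Scheduled job: resync at the start of every hour (entries in /etc/crontab need a user field)
cat >> /etc/crontab <<EOF
0 * * * * root /usr/sbin/ntpdate cn.pool.ntp.org
EOF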
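To confirm that the clocks really agree after the sync, one option is to compare epoch seconds across hosts. This is a rough sketch, assuming passwordless SSH from the server to each node. `skew_ok` and `check_host` are hypothetical helper names, and the 2-second threshold is an arbitrary choice.

```shell
#!/bin/sh
# Succeed when two epoch timestamps differ by at most $3 seconds.
skew_ok() {
  diff=$(( $1 - $2 ))
  if [ "$diff" -lt 0 ]; then
    diff=$(( 0 - diff ))
  fi
  [ "$diff" -le "$3" ]
}

# Compare the local clock with the clock on host $1 (read over SSH).
check_host() {
  remote=$(ssh "$1" date +%s) || return 1
  local_now=$(date +%s)
  if skew_ok "$local_now" "$remote" 2; then
    echo "$1: clock OK"
  else
    echo "$1: clock differs by more than 2s" >&2
    return 1
  fi
}

# Usage:
#   for h in 10.0.1.18 10.0.6.15 10.0.6.7 10.0.6.16 10.0.6.2; do check_host "$h"; done
```

SSH round-trip time adds a little error, so treat this as a coarse sanity check rather than a precise measurement.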
4. Create the account
Every instance needs a dedicated prometheus account:
sudo groupadd prometheus
sudo useradd -g prometheus -m -d /var/lib/prometheus -s /sbin/nologin prometheus
II. Installation and configuration
1. Install and configure node_exporter
# Unpack and set ownership
sudo tar xf node_exporter-1.1.2.linux-amd64.tar.gz -C /usr/local/
cd /usr/local/
sudo mv node_exporter-1.1.2.linux-amd64/ node_exporter
sudo chown -R prometheus:prometheus node_exporter/
# Create the systemd unit file
cd /usr/lib/systemd/system/
sudo touch node_exporter.service
sudo vim node_exporter.service
# Contents of node_exporter.service:
[Unit]
Description=node_exporter
Documentation=https://github.com/prometheus/node_exporter
After=network.target
[Service]
Type=simple
User=prometheus
ExecStart=/usr/local/node_exporter/node_exporter
Restart=on-failure
[Install]
WantedBy=multi-user.target
2. Install and configure the server
# Unpack and set ownership (the version must match the tarball downloaded earlier)
sudo tar xf prometheus-2.25.2.linux-amd64.tar.gz -C /usr/local/
cd /usr/local/
sudo mv prometheus-2.25.2.linux-amd64/ prometheus
sudo chown -R prometheus:prometheus prometheus/
# Edit the scrape configuration
cd /usr/local/prometheus
sudo vim prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
    - targets: ['localhost:9090','localhost:9100'] # also scrape the node_exporter on this host
  # Newly added job: scrape the other nodes
  - job_name: 'prometheus-node'
    # Overrides the global scrape interval: 5 seconds instead of 15
    scrape_interval: 5s
    static_configs:
    - targets: ['10.0.1.18:9100', '10.0.6.15:9100', '10.0.6.7:9100', '10.0.6.16:9100', '10.0.6.2:9100']
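Before starting the server it is worth validating the edited file, because YAML indentation mistakes are easy to make. The Prometheus tarball ships a `promtool` binary with a `check config` subcommand. The wrapper below is a small sketch: the function name `check_prometheus_config` and the `PROMTOOL` variable are illustrative, and the default path assumes the install location used in this article.

```shell
#!/bin/sh
# Path to promtool; override by exporting PROMTOOL before calling the function.
PROMTOOL="${PROMTOOL:-/usr/local/prometheus/promtool}"

# Validate a Prometheus config file.
# Returns 2 when promtool is missing, otherwise promtool's own exit status.
check_prometheus_config() {
  if [ ! -x "$PROMTOOL" ]; then
    echo "promtool not found at $PROMTOOL" >&2
    return 2
  fi
  "$PROMTOOL" check config "$1"
}

# Usage:
#   check_prometheus_config /usr/local/prometheus/prometheus.yml
```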
# Create the systemd unit file
sudo vim /usr/lib/systemd/system/prometheus.service
# Contents of prometheus.service (--storage.tsdb.retention.time replaces the deprecated --storage.tsdb.retention):
[Unit]
Description=prometheus
After=network.target
[Service]
Type=simple
User=prometheus
ExecStart=/usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml --storage.tsdb.path=/var/lib/prometheus --storage.tsdb.retention.time=15d --log.level=debug
Restart=on-failure
[Install]
WantedBy=multi-user.target
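3. Start the services
(1) node_exporter
# Make systemd pick up the new unit file
sudo systemctl daemon-reload
sudo systemctl enable node_exporter.service
sudo systemctl start node_exporter.service
sudo systemctl status node_exporter.service
# node_exporter listens on port 9100 by default
ss -tnl | grep 9100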
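Beyond checking the listening port, you can confirm that the exporter serves data by fetching its `/metrics` endpoint. A minimal sketch, assuming `curl` is installed. `count_metric` is a hypothetical helper that counts the samples of one metric in the text it reads from standard input.

```shell
#!/bin/sh
# Count sample lines for metric $1 in Prometheus text format read from stdin.
# Lines starting with "#" (HELP/TYPE comments) are not samples and are skipped.
count_metric() {
  grep -v '^#' | grep -c "^$1[{ ]"
}

# Usage:
#   curl -s http://localhost:9100/metrics | count_metric node_cpu_seconds_total
```

A non-zero count for `node_cpu_seconds_total` means the CPU collector is exporting the series used by the dashboard query later in this article.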
(2) Server
sudo systemctl enable prometheus.service
sudo systemctl start prometheus.service
sudo systemctl status prometheus.service
4. Open the web UI to check that it is running
Open http://server_ip:9090/ in a browser
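The same check can be done from the command line. Prometheus exposes the management endpoints `/-/healthy` and `/-/ready`; the sketch below polls one of them until it answers. It assumes `curl` is installed. `wait_for_http` is a made-up helper name, and the retry defaults are arbitrary.

```shell
#!/bin/sh
# Poll a URL until it answers with an HTTP success status or the attempts run out.
# $1 = URL, $2 = max attempts (default 10), $3 = seconds between attempts (default 2).
wait_for_http() {
  tries="${2:-10}"
  delay="${3:-2}"
  i=1
  while :; do
    if curl -fsS -o /dev/null --max-time 5 "$1" 2>/dev/null; then
      return 0
    fi
    # Give up once the last allowed attempt has failed.
    [ "$i" -ge "$tries" ] && return 1
    i=$(( i + 1 ))
    sleep "$delay"
  done
}

# Usage (replace server_ip with the server's address):
#   wait_for_http http://server_ip:9090/-/ready && echo "Prometheus is ready"
```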
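III. Grafana
1. Install Grafana
Install it on the server:
sudo apt-get install -y adduser libfontconfig1
wget https://dl.grafana.com/oss/release/grafana_7.5.1_amd64.deb
sudo dpkg -i grafana_7.5.1_amd64.deb
# The package does not start the service by itself
sudo systemctl daemon-reload
sudo systemctl enable grafana-server
sudo systemctl start grafana-server
Verify: open http://server_ip:3000/ in a browser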
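2. Configure the Grafana data source
(1) Log in to the Grafana console (the default credentials are admin / admin)
(2) Add a data source
(3) Select Prometheus
(4) Fill in the settings
Set the URL (for example http://localhost:9090, since Grafana runs on the same host as Prometheus), then click Save & Test at the bottom of the page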
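As an alternative to clicking through the UI, Grafana can load data sources from provisioning files at startup. The sketch below writes such a file. It assumes the deb package's default provisioning directory `/etc/grafana/provisioning/datasources/`. The helper name `write_datasource` and the output file names are illustrative.

```shell
#!/bin/sh
# Write a Grafana data source provisioning file.
# $1 = output file, $2 = Prometheus URL (default: http://localhost:9090).
write_datasource() {
  cat > "$1" <<EOF
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: ${2:-http://localhost:9090}
    isDefault: true
EOF
}

# Usage (Grafana reads provisioning files when it starts, so restart it afterwards):
#   write_datasource /tmp/prometheus.yaml http://localhost:9090
#   sudo install -m 644 /tmp/prometheus.yaml /etc/grafana/provisioning/datasources/prometheus.yaml
#   sudo systemctl restart grafana-server
```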
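3. Configure the dashboard
(1) Return to the home page and click Create to create a dashboard
(2) Add an empty panel
(3) Configure the monitoring query
# Enter the query below and press Shift+Enter to preview it. It returns the fraction of CPU time each instance spent idle.
# increase() needs at least two samples inside the window, so if nothing is displayed, try changing 10s to 1m
sum by (instance)(increase(node_cpu_seconds_total{mode="idle"}[10s])) / sum by (instance)(increase(node_cpu_seconds_total[10s]))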
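The same expression can also be run outside Grafana through the Prometheus HTTP API (`/api/v1/query`), which makes it easy to try different range windows. A small sketch, assuming `curl` is installed. The helper names `cpu_idle_query` and `run_cpu_idle_query` are invented for this example.

```shell
#!/bin/sh
# Print the idle-CPU-fraction query for a given range window (default 1m).
cpu_idle_query() {
  window="${1:-1m}"
  printf 'sum by (instance)(increase(node_cpu_seconds_total{mode="idle"}[%s])) / sum by (instance)(increase(node_cpu_seconds_total[%s]))\n' \
    "$window" "$window"
}

# Send the query to a Prometheus server and print the JSON response.
# $1 = base URL such as http://server_ip:9090, $2 = range window.
run_cpu_idle_query() {
  curl -fsS -G "$1/api/v1/query" --data-urlencode "query=$(cpu_idle_query "$2")"
}

# Usage:
#   run_cpu_idle_query http://server_ip:9090 1m
```

Each value in the response is an idle fraction between 0 and 1 for one instance; subtracting it from 1 gives the busy fraction.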
(4) Save the dashboard; here I named it System Monitor
4. Verify
(1) Return to the home page and check that there is a dashboard named "System Monitor"
(2) Open View to see the monitoring data