Prometheus: Deleting Organizations and Shutting Down Prometheus

Reposted

温柔一刀 2024-04-22 10:37:48

Tags: prometheus, data loading, linux. Category: cloud native, cloud computing

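Introduction to Prometheus

Features

Installing Prometheus

Installation overview

The prometheus.yml file

Installing Prometheus step by step

Common Prometheus command-line flags

Starting the Prometheus server automatically at boot (explained)

A quick tour of the Prometheus web UI

What node_exporter does

Installing node_exporter

Enabling and disabling node_exporter collectors

Adding node_exporter to Prometheus and reloading

Monitoring a remote machine

File-based dynamic service discovery

Installing Consul manually

What Consul does for Prometheus

White-box vs. black-box monitoring

How Prometheus discovers targets to monitor: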

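Introduction to Prometheus

Prometheus is an open-source system monitoring and alerting tool. It is built around a multi-dimensional time-series data model and can collect monitoring data from servers, containers, applications, and many other sources. Once data has been collected, the Prometheus query language PromQL can select and aggregate it, giving users insight into system state and performance and helping them spot potential problems early.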

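Features

Monitoring system: Prometheus

Prometheus Server: the core component; it covers the basic functions of Prometheus.
Scraper: pulls metrics from targets over HTTP.

A target must satisfy three conditions to be scraped:

It supports the Prometheus metric format.
It exposes its own metrics; a target that cannot do so needs a separate application (an exporter) to expose them on its behalf.
It exposes them over HTTP, because every scrape is an HTTP call.

TSDB: the built-in time series database, which stores the scraped data.
Web UI: the built-in query browser.
Alert Rule: built-in alerting rules; they generate alerts that are sent to Alertmanager.

Alertmanager: the component that actually delivers the notifications. It runs as a separate service alongside Prometheus Server. Common ways of reaching receivers: email, SMS, and so on.
Node Exporter: a dedicated exporter that exposes node-level metrics so that hosts can be monitored.
Application monitoring: either use the instrumentation built into the application, or use a separate exporter that collects metrics for that specific application.

Built-in instrumentation
A separately deployed, dedicated exporter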
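The three scrape conditions listed above (Prometheus text format, self-exposed metrics, served over HTTP) can be illustrated with a minimal sketch that uses only the Python standard library. The metric name demo_uptime_seconds, the port 8000, and the function names are assumptions made for this example; a real application would more likely use an official client library.

```python
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

START_TIME = time.time()


def render_metrics() -> str:
    """Render one gauge in the Prometheus text exposition format.

    demo_uptime_seconds is a hypothetical metric name used for illustration.
    """
    uptime = time.time() - START_TIME
    lines = [
        "# HELP demo_uptime_seconds Seconds since this demo process started.",
        "# TYPE demo_uptime_seconds gauge",
        f"demo_uptime_seconds {uptime:.3f}",
    ]
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Serve the metrics on /metrics, the default path Prometheus scrapes."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Block forever, answering scrapes on the given (assumed) port."""
    HTTPServer(("0.0.0.0", port), MetricsHandler).serve_forever()
```

Calling serve() and adding "localhost:8000" as a target in a scrape job would let Prometheus collect the gauge on each scrape.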

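Installing Prometheus

Installation overview

Getting started with Prometheus:
(1) Deploy Prometheus Server.
(2) Bring Prometheus Server itself under monitoring.
It ships with built-in instrumentation, so nothing extra has to be installed.
(3) Bring the node that hosts Prometheus Server under monitoring.
(a) Deploy a dedicated exporter, listening on port 9100 and exposing metrics at /metrics.
(b) Configure Prometheus Server to discover and monitor that exporter; static configuration also works.

Learning to use Prometheus:
PromQL

In production

Persistence and high availability
Prometheus high availability
High availability of the monitoring system
Multi-level monitoring: what happens if Prometheus itself goes down? A recommended approach is to have one Prometheus monitor another, which raises the availability of each component.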

The prometheus.yml file

# my global config
global:                     # global settings
  scrape_interval: 15s      # global scrape interval (default 1m); used by every job in scrape_configs that does not set its own
  evaluation_interval: 15s  # how often the rules loaded from rule_files are evaluated (default 1m)
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration: where Prometheus sends its alerts
alerting:                   # start of the alerting settings
  alertmanagers:            # settings for the Alertmanager instances
    - static_configs:       # static configuration: the Alertmanager address is written
                            # directly here rather than obtained dynamically from a
                            # configuration management tool or a service discovery mechanism
        - targets:          # list of Alertmanager addresses (host:port)
          # - alertmanager:9093   # example

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:                 # rule expressions run periodically over the scraped samples; a recording rule saves its result as a new time series
  # - "first_rules.yml"     # rules without an interval of their own run every evaluation_interval
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:             # scrape settings for individual services
  - job_name: "prometheus"  # a job groups endpoints with the same or similar function so they are scraped together

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:                    # addresses are listed statically
      - targets: ["localhost:9090"]    # targets may list several monitored endpoints

# A web service may expose its metrics under a specific path; the default is /metrics, and metrics_path overrides it.
# The protocol defaults to http; scheme selects between the two supported values, http and https.

The resulting definition
  - job_name: "prometheus"   # define one job per group of similar applications
    metrics_path: /metrics
    scheme: http

    static_configs:
      - targets: ["localhost:9090"]

URL the metrics are scraped from:
http://localhost:9090/metrics
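To make the scrape URL concrete, the sketch below does roughly what one scrape amounts to: an HTTP GET against the metrics path, followed by parsing the text format into samples. It is a simplified illustration, not how Prometheus itself is implemented; the function names are made up for this example, and the parser assumes sample lines carry no trailing timestamp.

```python
from urllib.request import urlopen


def parse_samples(text: str) -> dict:
    """Parse Prometheus text-format lines into {series: value}.

    Simplification (assumption): each sample line is "<series> <value>"
    with no timestamp; "# HELP" and "# TYPE" lines are skipped.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Label values may contain spaces, so split on the last space only.
        series, value = line.rsplit(" ", 1)
        samples[series] = float(value)
    return samples


def scrape(url: str = "http://localhost:9090/metrics", timeout: float = 5.0) -> dict:
    """Fetch one metrics page over HTTP and parse it."""
    with urlopen(url, timeout=timeout) as resp:
        return parse_samples(resp.read().decode("utf-8"))
```

With Prometheus running locally, scrape() returns a dictionary keyed by series name (including any label set), which is a convenient way to confirm that the endpoint is reachable before adding it as a target.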

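Installing Prometheus step by step

tar xf prometheus-2.44.0.linux-amd64.tar.gz -C /usr/local/
cd /usr/local/ # in production, keep the data directory out of the install directory; put it on storage with good I/O that suits long-term retention
ln -sv prometheus-2.44.0.linux-amd64/ prometheus # the symlink keeps later upgrades simple: unpack the new version and repoint the link
cd prometheus # enter the directory
cp prometheus.yml{,.bak} # back up the config in case something goes wrong

./prometheus --config.file=./prometheus.yml # run it, monitoring itself (the port is the default, so it need not be given)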

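The Prometheus start command has three parts:

1 ./prometheus: the Prometheus executable.

2 --config.file=./prometheus.yml: the configuration file, written in YAML, which describes what to scrape and how often, which rule files to load, and where to send alerts.

3 --web.listen-address=:9090: the IP address and port Prometheus listens on. In this example Prometheus listens on all available IP addresses and serves the web UI and API on port 9090.

The complete start command is therefore:

./prometheus --config.file=./prometheus.yml --web.listen-address=:9090
After this command runs, Prometheus listens on port 9090 and serves the web UI and API, which are used to browse metrics, inspect alerting rules, run queries, and so on.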

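Common Prometheus command-line flags

1 Show the version: ./prometheus --version
2 Listen address: set on the command line, for example --web.listen-address=:9090.
3 Configuration file path: set on the command line, for example --config.file=./prometheus.yml.
4 Storage path: set on the command line (not in the configuration file); it names the directory where time series data is kept, for example --storage.tsdb.path=/data/prometheus.
5 To reload the configuration over HTTP while Prometheus is running, add --web.enable-lifecycle to the start command.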

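Starting the Prometheus server automatically at boot (explained)

◼ Notes

◆ The user prometheus must be created beforehand.

◆ Change the value of ExecStart to point to the actual location of the program file.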

[Unit]
Description=Monitoring system and time series database
# Official documentation; consult it when something is unclear
Documentation=https://prometheus.io/docs/introduction/overview/

[Service]
Restart=always
User=prometheus
EnvironmentFile=/etc/default/prometheus
ExecStart=/usr/bin/prometheus $ARGS
ExecReload=/bin/kill -HUP $MAINPID
TimeoutStopSec=20s
SendSIGKILL=no
LimitNOFILE=8192

[Install]
WantedBy=multi-user.target

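Explanation:

Unit: general information about the service, such as its description and documentation.
Service: the concrete behavior of the service, including the start command, environment variables, the user it runs as, and resource limits.
Install: how the unit is enabled, that is, which target pulls it in so that it starts automatically at boot.
In detail, the unit file above specifies the following:

Description: Description=Monitoring system and time series database describes what the service is and does.

Restart: Restart=always restarts the service automatically whenever it exits.

User: User=prometheus runs the service as the prometheus user.

Environment: EnvironmentFile=/etc/default/prometheus loads the environment variables defined in that file.

Start command: ExecStart=/usr/bin/prometheus $ARGS is the command that starts the service; $ARGS expands to the command-line arguments held in the ARGS variable from the environment file.

Reload command: ExecReload=/bin/kill -HUP $MAINPID sends SIGHUP to the main process when systemctl reload is run, which makes Prometheus reload its configuration file.

Stopping: TimeoutStopSec=20s, SendSIGKILL=no. When the service is stopped, systemd sends SIGTERM so the process can shut down gracefully and waits up to 20 seconds; because SendSIGKILL=no, it does not follow up with SIGKILL if the process is still running after that.

File descriptors: LimitNOFILE=8192 caps the number of file descriptors the service may open.

Install target: WantedBy=multi-user.target makes the service a dependency of multi-user.target, so it is started automatically when the system boots.

This unit file normally lives at /etc/systemd/system/prometheus.service. After creating or editing it, run systemctl daemon-reload, then systemctl enable prometheus.service to add the service to the boot sequence and systemctl start prometheus.service to start it.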

A quick tour of the Prometheus web UI

Overview of the page contents

The graph page

The monitored targets that have been configured

The up metric records whether each target is being scraped successfully; querying up shows whether every target is running normally

Querying up == 0 (targets that are down) returns nothing here

Prometheus attaches labels of its own to every series, so labels can be used to look up specific information
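The same queries can be issued outside the web UI through the Prometheus HTTP API endpoint /api/v1/query. The sketch below is a minimal illustration using the Python standard library; the helper names are invented for this example, and the base URL is whatever address the server listens on.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def query_url(base_url: str, promql: str) -> str:
    """Build the instant-query URL for a PromQL expression."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def instant_query(base_url: str, promql: str, timeout: float = 5.0) -> list:
    """Run an instant query and return the list of result series."""
    with urlopen(query_url(base_url, promql), timeout=timeout) as resp:
        payload = json.load(resp)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload}")
    return payload["data"]["result"]


# Example expressions, matching the queries described above:
#   instant_query("http://localhost:9090", "up")                  # every target
#   instant_query("http://localhost:9090", "up == 0")             # targets that are down
#   instant_query("http://localhost:9090", 'up{job="prometheus"}')  # filter by label
```

Each returned item carries a metric dictionary with the series labels (such as job and instance) and a value pair of timestamp and sample value.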

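What node_exporter does

node_exporter is an open-source exporter from the Prometheus project that runs on each monitored host. It collects important system-level metrics (for example CPU, memory, and disk usage) and exposes them to Prometheus, so that server resources and performance can be monitored and tuned and the data can be managed and queried centrally in Prometheus. It targets Unix-like systems such as Linux and macOS; Windows hosts use the separate windows_exporter instead.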

Installing node_exporter

[root@rocky8 local]#tar xf node_exporter-1.6.0.linux-amd64.tar.gz -C /usr/local/
[root@rocky8 local]#ln -vs node_exporter-1.6.0.linux-amd64 node_exporter
[root@rocky8 local]#cd node_exporter
[root@rocky8 node_exporter]#./node_exporter

Enabling and disabling node_exporter collectors

./node_exporter --collector.<name> enables a collector; --no-collector.<name> disables it

[root@rocky8 node_exporter]#./node_exporter --collector.ntp --collector.tcpstat --no-collector.zfs

Checking in the browser

The metrics that node_exporter exposes on port 9100 look like a static page; the values update each time the page is refreshed

Prometheus exposes its own metrics on port 9090

Adding node_exporter to Prometheus and reloading

Add node_exporter to the Prometheus configuration so that it is scraped

[root@rocky8 prometheus]#vim prometheus.yml
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "node_exporter"
    metrics_path: '/metrics'
    scheme: 'http'
    static_configs:
      - targets:
        - "10.0.0.8:9100"
        - "10.0.0.18:9100"

To reload the configuration over HTTP, Prometheus must be started with --web.enable-lifecycle

About --web.enable-lifecycle

When it is enabled, Prometheus serves the /-/reload endpoint, which reloads the configuration file on request. When it is disabled, Prometheus does not serve /-/reload.
Enable it:
./prometheus --config.file=prometheus.yml --web.enable-lifecycle

curl -XPOST http://localhost:9090/-/reload # reload the configuration file
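For scripts that edit the configuration and then need to trigger a reload, the curl call above can be reproduced with the Python standard library. This is a small sketch; the function names are chosen for the example, and it assumes Prometheus was started with --web.enable-lifecycle.

```python
from urllib.request import Request, urlopen


def build_reload_request(base_url: str = "http://localhost:9090") -> Request:
    """Build the POST request for the /-/reload lifecycle endpoint."""
    return Request(f"{base_url.rstrip('/')}/-/reload", data=b"", method="POST")


def reload_prometheus(base_url: str = "http://localhost:9090", timeout: float = 10.0) -> int:
    """Ask Prometheus to re-read its configuration and return the HTTP status.

    urlopen raises urllib.error.HTTPError for a non-2xx response, for
    example when the endpoint is not enabled or the new config is invalid.
    """
    with urlopen(build_reload_request(base_url), timeout=timeout) as resp:
        return resp.status
```

Catching urllib.error.HTTPError around reload_prometheus() lets a deployment script report a rejected configuration instead of silently continuing.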


Monitoring a remote machine

Copy node_exporter over from 10.0.0.8 (remember to add the new target to prometheus.yml); it is then scraped every 15 seconds

[root@rocky8 local]#scp -r node_exporter-1.6.0.linux-amd64 10.0.0.18:/tmp/
Run on 10.0.0.18:

[root@rocky8 tmp]#ln -vs node_exporter-1.6.0.linux-amd64/ node_exporter

[root@rocky8 tmp]#cd node_exporter

[root@rocky8 node_exporter]#./node_exporter

File-based dynamic service discovery

vim /usr/local/prometheus/prometheus.yml # point the job at the target files

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "node_exporter"
    metrics_path: '/metrics'
    scheme: 'http'
    file_sd_configs:                   # file-based service discovery
      - files:                         # list of files to load
          - targets/nodes-*.yml        # glob wildcards are supported
        refresh_interval: 2m           # re-read the targets defined in the files every 2 minutes (default 5m)

[root@rocky8 prometheus]#vim targets/nodes-linux.yml # the target list

- targets:
  - 10.0.0.18:9100
  - 10.0.0.8:9100
  labels:
    os: rocky

curl -XPOST http://10.0.0.8:9090/-/reload # reload once at the end; targets added to the files afterwards are picked up without another reload
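Because Prometheus re-reads the target files on its own, the files are a natural place for a script to publish targets. The sketch below writes one target group in the same shape as nodes-linux.yml. The function names are invented for this example, values are assumed to contain no single quotes, and the write goes through a temporary file so Prometheus does not read a half-written file.

```python
import os
import tempfile


def render_target_group(targets: list, labels: dict) -> str:
    """Render one file_sd target group as YAML text.

    Assumption: targets and label values contain no single quotes,
    so simple single-quoted YAML scalars are sufficient.
    """
    lines = ["- targets:"]
    lines += [f"  - '{t}'" for t in targets]
    if labels:
        lines.append("  labels:")
        lines += [f"    {k}: '{v}'" for k, v in sorted(labels.items())]
    return "\n".join(lines) + "\n"


def write_targets_file(path: str, targets: list, labels: dict) -> None:
    """Write the target file atomically (temp file, then rename)."""
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)
    # The ".tmp" suffix keeps the temp file from matching a "nodes-*.yml" glob.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        fh.write(render_target_group(targets, labels))
    os.chmod(tmp_path, 0o644)  # mkstemp creates 0600; let the prometheus user read it
    os.replace(tmp_path, path)
```

For example, write_targets_file("targets/nodes-linux.yml", ["10.0.0.18:9100", "10.0.0.8:9100"], {"os": "rocky"}) would produce a file equivalent to the one edited by hand above.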

Installing Consul manually

Consul host: 10.0.0.101
unzip consul_1.15.2_linux_amd64.zip -d /usr/local/
mv /usr/local/consul /usr/local/bin/
mkdir -pv /consul/data/
mkdir -pv /etc/consul
vim /etc/consul/node.json
consul agent -dev -ui -data-dir=/consul/data/ -config-dir=/etc/consul/ -client=0.0.0.0

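What Consul does for Prometheus

Consul provides service registration and service discovery for Prometheus, so that Prometheus can monitor running service instances dynamically and update its list of targets automatically.

In prometheus.yml, a job can point at the Consul server address so that services and their ports are discovered through Consul

[root@rocky8 prometheus]#vim prometheus.yml

  - job_name: 'nodes'          # the job name stays active so that consul_sd_configs belongs to a job
#    file_sd_configs:
#    - files:
#      - targets/nodes-*.yaml
#      refresh_interval: 2m
    consul_sd_configs:
      - server: '10.0.0.8:8500'
        tags:
          - "nodes"


curl -XPOST http://10.0.0.8:9090/-/reload  # reload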
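For Consul-based discovery to find anything, each exporter has to be registered in Consul with a tag that the job filters on ("nodes" in the configuration above). One way to do that is the Consul agent HTTP API endpoint /v1/agent/service/register. The sketch below builds such a request with the Python standard library; the service name, the ID scheme, and the helper names are assumptions for this example, and the Consul address is whichever agent the service should be registered with.

```python
import json
from urllib.request import Request, urlopen


def service_definition(name: str, address: str, port: int, tags: list) -> dict:
    """Build a Consul service definition for one exporter instance.

    The ID format "<name>-<address>-<port>" is a hypothetical convention.
    """
    return {
        "ID": f"{name}-{address}-{port}",
        "Name": name,
        "Address": address,
        "Port": port,
        "Tags": list(tags),
    }


def build_register_request(consul_url: str, definition: dict) -> Request:
    """Build the PUT request that registers the service with a Consul agent."""
    return Request(
        f"{consul_url.rstrip('/')}/v1/agent/service/register",
        data=json.dumps(definition).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def register(consul_url: str, definition: dict, timeout: float = 5.0) -> int:
    """Send the registration and return the HTTP status code."""
    with urlopen(build_register_request(consul_url, definition), timeout=timeout) as resp:
        return resp.status
```

Registering service_definition("node_exporter", "10.0.0.18", 9100, ["nodes"]) against the Consul agent would make that exporter visible to a job whose consul_sd_configs filters on the "nodes" tag.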

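White-box vs. black-box monitoring

White-box monitoring looks at the actual internal state of a system; by watching its metrics, likely problems can be anticipated and potential sources of instability can be optimized away. From the point of view of a complete monitoring setup, however, extensive white-box monitoring should be complemented by an appropriate amount of black-box monitoring. Black-box monitoring tests the externally visible behavior of a service from the position of a user; common forms include HTTP probes and TCP probes, which check whether a site or service is reachable and how quickly it responds.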
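A TCP probe of the kind described above can be sketched in a few lines: it only attempts a connection from the outside and measures how long that takes, without knowing anything about the service internals. The function name and the default timeout are choices made for this example; in a Prometheus setup this role is usually filled by a dedicated prober rather than a hand-written script.

```python
import socket
import time


def tcp_probe(host: str, port: int, timeout: float = 3.0) -> tuple:
    """Try to open a TCP connection and report (reachable, seconds_elapsed)."""
    start = time.monotonic()
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False, time.monotonic() - start
```

For instance, tcp_probe("10.0.0.8", 9100) reports whether the node_exporter port on that host accepts connections and how long the handshake took.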