Zabbix server alerts: what they mean and how to handle them
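I. More than 100 items having missing data for more than 10 minutes
Meaning: more than 100 items have received no data for over 10 minutes.
Possible causes:
1. The clocks on the server and the proxy are out of sync.
2. The caches allocated on the server are too small.
3. The server has too few worker processes configured.
4. The server is under heavy load (CPU, I/O, memory).
Solution: 1. increase the number of worker processes; 2. increase the cache sizes. The specific parameters to change are as follows: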
[root@localhost zabbix]# vim /usr/local/zabbix/etc/zabbix_server.conf
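# Number of pre-forked poller processes
StartPollers=500
StartPollersUnreachable=50
StartTrappers=30
# Number of pre-forked discoverer processes
StartDiscoverers=6
# Configuration cache size
CacheSize=1G
# How often the configuration cache is refreshed, in seconds
CacheUpdateFrequency=300
# Number of pre-forked DB syncer processes
StartDBSyncers=20
# History cache size, used to store history data
HistoryCacheSize=512M
# Trend cache size, used to store trend data
TrendCacheSize=256M
# Range: 128K-2G. Cache size for text-type history (character, text, and log values)
HistoryTextCacheSize=80M
# History value cache size; 0 disables it. When the cache runs out of shared memory, a warning is written to the server log every 5 minutes.
ValueCacheSize=1G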
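Before restarting the server with larger values, it can help to sanity-check the file. The following is a minimal sketch, not part of Zabbix itself: it parses zabbix_server.conf-style text, warns about comments placed after a value (Zabbix documents that "#" comments are only supported at the beginning of a line), and adds up the main shared-memory caches so the total can be compared with the RAM available. The function names and the choice of which caches to sum are assumptions made for illustration.

```python
import re

# Size suffixes accepted by this sketch (binary multiples).
_UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(value):
    """Convert a size such as '512M' or '1G' to bytes."""
    match = re.fullmatch(r"(\d+)([KMG]?)", value.strip())
    if match is None:
        raise ValueError(f"not a size: {value!r}")
    number, suffix = match.groups()
    return int(number) * _UNITS.get(suffix, 1)


def parse_conf(text):
    """Return (params, warnings) for zabbix_server.conf-style text."""
    params, warnings = {}, []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or full-line comment
        key, sep, value = line.partition("=")
        if not sep:
            warnings.append(f"line {lineno}: no '=' found")
            continue
        key, value = key.strip(), value.strip()
        if "#" in value:
            # A comment after the value is not supported by Zabbix.
            warnings.append(f"line {lineno}: trailing comment after {key}")
            value = value.split("#", 1)[0].strip()
        params[key] = value
    return params, warnings


def total_cache_bytes(params, keys=("CacheSize", "HistoryCacheSize",
                                    "TrendCacheSize", "ValueCacheSize")):
    """Sum the configured sizes of the given cache parameters, in bytes."""
    return sum(parse_size(params[k]) for k in keys if k in params)
```

For example, calling `parse_conf(open("/usr/local/zabbix/etc/zabbix_server.conf").read())` and then `total_cache_bytes(params)` gives a rough lower bound on the shared memory the server will request at startup.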
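II. Too many processes on 10.9.19.217 (zabbix server)
Meaning: too many processes are running on 10.9.19.217 (the Zabbix server).
There are two possible situations:
1. Check whether the server is running processes it does not need; if so, stop them.
2. If the server genuinely needs that many processes, raise the trigger threshold instead, as shown in the figure below.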
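To decide between the two cases, it helps to know how many processes are actually running. The sketch below counts them on Linux by listing the numeric entries under /proc and compares the result with a threshold. The default of 300 is an assumption (check the value in your own trigger), and the function names are hypothetical.

```python
import os


def count_processes(proc_root="/proc"):
    """Count entries under proc_root whose names are all digits (one per PID)."""
    return sum(1 for name in os.listdir(proc_root) if name.isdigit())


def too_many(count, threshold=300):
    """Return True when the process count exceeds the trigger threshold."""
    return count > threshold


if __name__ == "__main__":
    running = count_processes()
    print(f"{running} processes running, over threshold: {too_many(running)}")
```

If the count sits well above the threshold during normal operation, raising the threshold is the reasonable fix; if it only spikes occasionally, look for the processes responsible first.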
III. Zabbix housekeeper processes more than 75% busy
What is the housekeeper? Let's work it out from the configuration file (below).
### Option: HousekeepingFrequency
# How often Zabbix will perform housekeeping procedure (in hours).
# Housekeeping is removing outdated information from the database.
# To prevent Housekeeper from being overloaded, no more than 4 times HousekeepingFrequency
# hours of outdated information are deleted in one housekeeping cycle, for each item.
# To lower load on server startup housekeeping is postponed for 30 minutes after server start.
# With HousekeepingFrequency=0 the housekeeper can be only executed using the runtime control option.
# In this case the period of outdated information deleted in one housekeeping cycle is 4 times the
# period since the last housekeeping cycle, but not less than 4 hours and not greater than 4 days.
#
# Mandatory: no
# Range: 0-24
# Default:
HousekeepingFrequency=12
### Option: MaxHousekeeperDelete
# The table "housekeeper" contains "tasks" for housekeeping procedure in the format:
# [housekeeperid], [tablename], [field], [value].
# No more than 'MaxHousekeeperDelete' rows (corresponding to [tablename], [field], [value])
# will be deleted per one task in one housekeeping cycle.
# SQLite3 does not use this parameter, deletes all corresponding rows without a limit.
# If set to 0 then no limit is used at all. In this case you must know what you are doing!
#
# Mandatory: no
# Range: 0-1000000
# Default:
MaxHousekeeperDelete=1000000
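In short, the housekeeper deletes outdated history data from the database. HousekeepingFrequency is how often the cleanup runs; here I set it to once every 12 hours. MaxHousekeeperDelete is a cap: each time a housekeeping task runs, it deletes at most that many rows. In practice it simply issues delete operations against MySQL.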
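The limits described in the HousekeepingFrequency comments can be turned into a small calculation. This is only a sketch of the arithmetic stated in the configuration file (4 times the frequency, or, with a frequency of 0, 4 times the time since the last cycle, clamped between 4 hours and 4 days); the function name is hypothetical and it does not query Zabbix.

```python
def max_deleted_hours(housekeeping_frequency, hours_since_last_cycle=None):
    """Upper bound, in hours, on outdated data removed per item in one cycle."""
    if not 0 <= housekeeping_frequency <= 24:
        raise ValueError("HousekeepingFrequency must be in the range 0-24")
    if housekeeping_frequency > 0:
        # Scheduled housekeeping: at most 4 x HousekeepingFrequency hours.
        return 4 * housekeeping_frequency
    if hours_since_last_cycle is None:
        raise ValueError("hours_since_last_cycle is required when frequency is 0")
    # Manual housekeeping: 4 x elapsed time, clamped to [4 hours, 4 days].
    return min(max(4 * hours_since_last_cycle, 4), 4 * 24)


print(max_deleted_hours(12))      # the setting used above
print(max_deleted_hours(0, 10))   # manual run 10 hours after the previous one
```

With HousekeepingFrequency=12, each cycle removes at most 48 hours of outdated data per item, so a large backlog takes several cycles to clear.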
IV. Zabbix poller processes more than 75% busy
Meaning: the Zabbix poller processes are busy more than 75% of the time.
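As a rough illustration of what "more than 75% busy" means, the alert fires when the pollers' average busy percentage over a recent window exceeds 75. The sketch below assumes the busy percentages have already been collected; in stock templates they usually come from the internal item zabbix[process,poller,avg,busy], which is an assumption about your setup. The function name and sample values are hypothetical.

```python
from statistics import mean


def is_overloaded(busy_samples, threshold=75.0):
    """Return True when the average busy percentage exceeds the threshold."""
    if not busy_samples:
        return False  # no data, nothing to alert on
    return mean(busy_samples) > threshold


# Hypothetical samples collected over the evaluation window.
print(is_overloaded([80, 90, 70]))
```

Because the check uses an average, a single short spike does not fire the alert; sustained load does.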
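Causes:
1. A process is stuck.
2. Too many zombie or failed processes are slowing things down.
3. Network latency (usually negligible).
4. Zabbix is consuming too much memory.
Solution:
Edit the StartPollers parameter (the number of pre-forked poller processes) in zabbix_server.conf,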