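Performance Monitoring with Telegraf + InfluxDB + Grafana: Installation and Use on Windows Servers

Preface

This article covers installing Telegraf on Windows and getting started with monitoring.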

Installation & Deployment

1. Find the download at https://portal.influxdata.com/downloads/

2. Create the directory C:\Program Files\Telegraf (if you install to a different location, pass the -config parameter with the desired location).

3. Extract the package and place telegraf.exe and telegraf.conf in C:\Program Files\Telegraf.

4. To install the service into the Windows Service Manager, run the following command in CMD as an administrator. If necessary, wrap any path component that contains spaces in double quotes: "<file directory>"


C:\"Program Files"\Telegraf\telegraf.exe --service install

or


C:\Program Files\Telegraf>telegraf.exe --service install

5. Edit the telegraf.conf configuration file to suit your requirements.


###############################################################################
#                                  INPUTS                                     #
###############################################################################

# Windows Performance Counters plugin.
# These are the recommended method of monitoring system metrics on windows,
# as the regular system plugins (inputs.cpu, inputs.mem, etc.) rely on WMI,
# which utilize more system resources.
#
# See more configuration examples at:
#   https://github.com/influxdata/telegraf/tree/master/plugins/inputs/win_perf_counters

[[inputs.win_perf_counters]]
  [[inputs.win_perf_counters.object]]
    # Processor usage, alternative to native, reports on a per core.
    ObjectName = "Processor"
    Instances = ["*"]
    Counters = [
      "% Idle Time",
      "% Interrupt Time",
      "% Privileged Time",
      "% User Time",
      "% Processor Time",
      "% DPC Time",
    ]
    Measurement = "win_cpu"
    # Set to true to include _Total instance when querying for all (*).
    IncludeTotal=true

  [[inputs.win_perf_counters.object]]
    # Disk times and queues
    ObjectName = "LogicalDisk"
    Instances = ["*"]
    Counters = [
      "% Idle Time",
      "% Disk Time",
      "% Disk Read Time",
      "% Disk Write Time",
      "Current Disk Queue Length",
      "% Free Space",
      "Free Megabytes",
    ]
    Measurement = "win_disk"
    # Set to true to include _Total instance when querying for all (*).
    #IncludeTotal=false

  [[inputs.win_perf_counters.object]]
    ObjectName = "PhysicalDisk"
    Instances = ["*"]
    Counters = [
      "Disk Read Bytes/sec",
      "Disk Write Bytes/sec",
      "Current Disk Queue Length",
      "Disk Reads/sec",
      "Disk Writes/sec",
      "% Disk Time",
      "% Disk Read Time",
      "% Disk Write Time",
    ]
    Measurement = "win_diskio"

  [[inputs.win_perf_counters.object]]
    ObjectName = "Network Interface"
    Instances = ["*"]
    Counters = [
      "Bytes Received/sec",
      "Bytes Sent/sec",
      "Packets Received/sec",
      "Packets Sent/sec",
      "Packets Received Discarded",
      "Packets Outbound Discarded",
      "Packets Received Errors",
      "Packets Outbound Errors",
    ]
    Measurement = "win_net"

  [[inputs.win_perf_counters.object]]
    ObjectName = "System"
    Counters = [
      "Context Switches/sec",
      "System Calls/sec",
      "Processor Queue Length",
      "System Up Time",
    ]
    Instances = ["------"]
    Measurement = "win_system"
    # Set to true to include _Total instance when querying for all (*).
    #IncludeTotal=false

  [[inputs.win_perf_counters.object]]
    # Example query where the Instance portion must be removed to get data back,
    # such as from the Memory object.
    ObjectName = "Memory"
    Counters = [
      "Available Bytes",
      "Cache Faults/sec",
      "Demand Zero Faults/sec",
      "Page Faults/sec",
      "Pages/sec",
      "Transition Faults/sec",
      "Pool Nonpaged Bytes",
      "Pool Paged Bytes",
      "Standby Cache Reserve Bytes",
      "Standby Cache Normal Priority Bytes",
      "Standby Cache Core Bytes",
    ]
    # Use 6 x - to remove the Instance bit from the query.
    Instances = ["------"]
    Measurement = "win_mem"
    # Set to true to include _Total instance when querying for all (*).
    #IncludeTotal=false

  [[inputs.win_perf_counters.object]]
    # Example query where the Instance portion must be removed to get data back,
    # such as from the Paging File object.
    ObjectName = "Paging File"
    Counters = [
      "% Usage",
    ]
    Instances = ["_Total"]
    Measurement = "win_swap"
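
The counter names configured above do not appear verbatim in InfluxDB: a counter such as "% DPC Time" is stored as the field Percent_DPC_Time. The following Python sketch illustrates that mapping. The replacement rules are an assumption based on the plugin's observed field names (the "/sec" to "_persec" rule in particular is not shown anywhere in this article), and counter_to_field is a hypothetical helper name, not part of Telegraf.

```python
# Assumed replacement rules, modeled on the field names that the
# win_perf_counters plugin is observed to write. Not taken from this article.
REPLACEMENTS = [
    ("/sec", "_persec"),
    ("/Sec", "_persec"),
    (" ", "_"),
    ("%", "Percent"),
    ("\\", ""),
]


def counter_to_field(counter):
    """Convert a Windows counter name into the expected InfluxDB field name."""
    field = counter
    for old, new in REPLACEMENTS:
        field = field.replace(old, new)
    return field


if __name__ == "__main__":
    # A few counters taken from the configuration above.
    for name in ["% Idle Time", "% DPC Time", "Disk Read Bytes/sec"]:
        print(name, "->", counter_to_field(name))
```

Knowing the mapping helps when writing InfluxQL queries or Grafana panels, since those must reference the stored field names rather than the original counter names.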

6. To verify that it works, run:


C:\"Program Files"\Telegraf\telegraf.exe --config C:\"Program Files"\Telegraf\telegraf.conf --test

or


C:\Program Files\Telegraf>telegraf.exe --config telegraf.conf --test

To start collecting data, run:


net start telegraf

7. Other operations: telegraf can manage its own service through the --service flag:


telegraf.exe --service install      # install the service
telegraf.exe --service uninstall    # remove the service
telegraf.exe --service start        # start the service
telegraf.exe --service stop         # stop the service
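
When these commands are scripted rather than typed, the quoting of "Program Files" can be avoided by passing the arguments as a list. The sketch below is one possible way to do that in Python; service_command and run_service_action are hypothetical helper names, the install directory is the one used in this article, and the script still has to be started from an administrator prompt.

```python
import subprocess
from pathlib import PureWindowsPath

# The four actions listed in this article for the --service flag.
VALID_ACTIONS = ("install", "uninstall", "start", "stop")


def service_command(install_dir, action):
    """Build the argument list for a telegraf.exe --service call."""
    if action not in VALID_ACTIONS:
        raise ValueError("unsupported service action: " + action)
    exe = PureWindowsPath(install_dir) / "telegraf.exe"
    # Each list element is passed as one argument, so the space in
    # "Program Files" needs no manual double quotes.
    return [str(exe), "--service", action]


def run_service_action(install_dir, action):
    """Run the command; only meaningful on a Windows host with Telegraf installed."""
    return subprocess.run(service_command(install_dir, action), check=True)


if __name__ == "__main__":
    # Only prints the command, so it is safe to run anywhere.
    print(service_command(r"C:\Program Files\Telegraf", "install"))
```

Rejecting unknown actions up front keeps a typo from reaching telegraf.exe at all.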

Integrating InfluxDB

Locate the OUTPUTS section of the configuration:


###############################################################################
#                                  OUTPUTS                                    #
###############################################################################

# Configuration for sending metrics to InfluxDB
[[outputs.influxdb]]
  ## The full HTTP or UDP URL for your InfluxDB instance.
  ##
  ## Multiple URLs can be specified for a single cluster, only ONE of the
  ## urls will be written to each interval.
  # urls = ["unix:///var/run/influxdb.sock"]
  # urls = ["udp://127.0.0.1:8089"]
  urls = ["http://172.16.14.111:8086"]

  ## The target database for metrics; will be created as needed.
  database = "bigscreen"

  ## If true, no CREATE DATABASE queries will be sent.  Set to true when using
  ## Telegraf with a user without permissions to create databases or when the
  ## database already exists.
  # skip_database_creation = false

  ## Name of existing retention policy to write to.  Empty string writes to
  ## the default retention policy.  Only takes effect when using HTTP.
  # retention_policy = ""

  ## Write consistency (clusters only), can be: "any", "one", "quorum", "all".
  ## Only takes effect when using HTTP.
  # write_consistency = "any"

  ## Timeout for HTTP messages.
  timeout = "5s"

  ## HTTP Basic Auth
  username = "telegraf"
  password = "telegraf"

  ## HTTP User-Agent
  # user_agent = "telegraf"

  ## UDP payload size is the maximum packet size to send.
  # udp_payload = "512B"

  ## Optional TLS Config for use on HTTP connections.
  # tls_ca = "/etc/telegraf/ca.pem"
  # tls_cert = "/etc/telegraf/cert.pem"
  # tls_key = "/etc/telegraf/key.pem"
  ## Use TLS but skip chain & host verification
  # insecure_skip_verify = false

  ## HTTP Proxy override, if unset values the standard proxy environment
  ## variables are consulted to determine which proxy, if any, should be used.
  # http_proxy = "http://corporate.proxy:3128"

  ## Additional HTTP headers
  # http_headers = {"X-Special-Header" = "Special-Value"}

  ## HTTP Content-Encoding for write request body, can be set to "gzip" to
  ## compress body or "identity" to apply no encoding.
  # content_encoding = "identity"

  ## When true, Telegraf will output unsigned integers as unsigned values,
  ## i.e.: "42u".  You will need a version of InfluxDB supporting unsigned
  ## integer values.  Enabling this option will result in field type errors if
  ## existing data has been written.
  # influx_uint_support = false
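
To make the HTTP output less of a black box, the Python sketch below shows roughly what a single write looks like: one line of InfluxDB line protocol sent to the /write endpoint of an InfluxDB 1.x server. It is an illustration, not Telegraf's implementation. The helper names to_line and write_url are hypothetical, only float fields are handled, and the sample value is made up; the URL and database name are the ones configured above.

```python
from urllib.parse import urlencode


def _escape(text, chars):
    """Backslash-escape the characters that line protocol treats as separators."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def to_line(measurement, tags, fields, timestamp_ns):
    """Format one point as line protocol (float fields only in this sketch)."""
    tag_part = "".join(
        "," + _escape(key, ", =") + "=" + _escape(str(value), ", =")
        for key, value in sorted(tags.items())
    )
    field_part = ",".join(
        _escape(key, ", =") + "=" + repr(float(value))
        for key, value in fields.items()
    )
    return _escape(measurement, ", ") + tag_part + " " + field_part + " " + str(timestamp_ns)


def write_url(base_url, database):
    """Build the InfluxDB 1.x write URL for a database, with nanosecond precision."""
    return base_url + "/write?" + urlencode({"db": database, "precision": "ns"})


if __name__ == "__main__":
    line = to_line(
        "win_cpu",
        {"host": "DESKTOP-MLD0KTS", "instance": "0", "objectname": "Processor"},
        {"Percent_Idle_Time": 81.5},  # made-up sample value
        1552012501000000000,
    )
    print(write_url("http://172.16.14.111:8086", "bigscreen"))
    print(line)
```

Tags (host, instance, objectname) are indexed and sorted by key here, while the counter values travel as fields, which matches the column layout InfluxDB reports for win_cpu.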

Verifying the Database


[root@localhost tools]# sudo influx
Connected to http://localhost:8086 version 1.7.4
InfluxDB shell version: 1.7.4
Enter an InfluxQL query
> use bigscreen
Using database bigscreen
> SHOW MEASUREMENTS
name: measurements
name
----
bigscreen
nvidia_smi
win_cpu
win_disk
win_diskio
win_mem
win_net
win_perf_counters
win_swap
win_system
> SELECT * FROM "win_cpu" limit 1
name: win_cpu
time                Percent_DPC_Time Percent_Idle_Time Percent_Interrupt_Time Percent_Privileged_Time Percent_Processor_Time Percent_User_Time  host            instance objectname
----                ---------------- ----------------- ---------------------- ----------------------- ---------------------- -----------------  ----            -------- ----------
1552012501000000000 0                81.72647857666016 0                      4.6642279624938965      9.824928283691406      4.6642279624938965 DESKTOP-MLD0KTS 0        Processor
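
The same check can be automated against the HTTP API of InfluxDB 1.x instead of the interactive shell. The Python sketch below builds the /query URL and inspects a response for missing measurements. The helper names are hypothetical, and the sample response is hand-written in the JSON shape that InfluxDB 1.x returns for SHOW MEASUREMENTS, not captured from a real server.

```python
from urllib.parse import urlencode


def query_url(base_url, database, influxql):
    """Build an InfluxDB 1.x /query URL for an InfluxQL statement."""
    return base_url + "/query?" + urlencode({"db": database, "q": influxql})


def measurement_names(response):
    """Collect measurement names from a parsed SHOW MEASUREMENTS response."""
    names = []
    for result in response.get("results", []):
        for series in result.get("series", []):
            names.extend(row[0] for row in series.get("values", []))
    return names


def missing_measurements(response, expected):
    """Return the expected measurements that the response does not contain."""
    present = set(measurement_names(response))
    return [name for name in expected if name not in present]


if __name__ == "__main__":
    # Hand-written sample response (assumption, for illustration only).
    sample = {
        "results": [
            {
                "statement_id": 0,
                "series": [
                    {
                        "name": "measurements",
                        "columns": ["name"],
                        "values": [["win_cpu"], ["win_disk"], ["win_mem"]],
                    }
                ],
            }
        ]
    }
    print(query_url("http://localhost:8086", "bigscreen", "SHOW MEASUREMENTS"))
    print(missing_measurements(sample, ["win_cpu", "win_net", "win_swap"]))
```

An empty result from missing_measurements means every measurement named in telegraf.conf has reached the database.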

Integrating a Grafana Dashboard

Visit https://grafana.com/dashboards?dataSource=influxdb&collector=Telegraf&search=window and download a suitable dashboard template, then import the template into Grafana.

For details, see the article "Performance Monitoring with Telegraf + InfluxDB + Grafana: Real-Time Server Monitoring" (性能监控之Telegraf+InfluxDB+Grafana服务器实时监控).

Monitoring Results

The final Grafana dashboard looks like this:

[Figure: final Grafana dashboard]