InfluxDB是一个当下比较流行的时序数据库,InfluxDB使用 Go 语言编写,无需外部依赖,安装配置非常方便,适合构建大型分布式系统的监控系统。
入门指南
https://jasper-zhang1.gitbooks.io/influxdb/content/Introduction/getting_start.html
操作系统版本
CentOS Linux release 7.2.1511 (Core)
1、下载influxdb安装包
wget https://dl.influxdata.com/influxdb/releases/influxdb-1.6.1.x86_64.rpm
2、安装
yum localinstall influxdb-1.6.1.x86_64.rpm
3、修改默认端口(根据需要)
influxd config命令能输出配置文件得内容
influxd config
vim /etc/influxdb/influxdb.conf
配置文件:
没做太多改动,主要是端口那块设置了一下
[root@VM-centos ~]# cat /etc/influxdb/influxdb.conf
### Welcome to the InfluxDB configuration file.
# The values in this file override the default values used by the system if
# a config option is not specified. The commented out lines are the configuration
# field and the default value used. Uncommenting a line and changing the value
# will change the value used at runtime when the process is restarted.
# Once every 24 hours InfluxDB will report usage data to usage.influxdata.com
# The data includes a random ID, os, arch, version, the number of series and other
# usage data. No data from user databases is ever transmitted.
# Change this option to true to disable reporting.
# reporting-disabled = false
# Bind address to use for the RPC service for backup and restore.
bind-address = "127.0.0.1:18088"
###
### [meta]
###
### Controls the parameters for the Raft consensus group that stores metadata
### about the InfluxDB cluster.
###
[meta]
# Where the metadata/raft database is stored
dir = "/var/lib/influxdb/meta"
# Automatically create a default retention policy when creating a database.
# retention-autocreate = true
# If log messages are printed for the meta service
# logging-enabled = true
###
### [data]
###
### Controls where the actual shard data for InfluxDB lives and how it is
### flushed from the WAL. "dir" may need to be changed to a suitable place
### for your system, but the WAL settings are an advanced configuration. The
### defaults should work for most systems.
###
[data]
# The directory where the TSM storage engine stores TSM files.
dir = "/var/lib/influxdb/data"
# The directory where the TSM storage engine stores WAL files.
wal-dir = "/var/lib/influxdb/wal"
# The amount of time that a write will wait before fsyncing. A duration
# greater than 0 can be used to batch up multiple fsync calls. This is useful for slower
# disks or when WAL write contention is seen. A value of 0s fsyncs every write to the WAL.
# Values in the range of 0-100ms are recommended for non-SSD disks.
# wal-fsync-delay = "0s"
# The type of shard index to use for new shards. The default is an in-memory index that is
# recreated at startup. A value of "tsi1" will use a disk based index that supports higher
# cardinality datasets.
# index-version = "inmem"
# Trace logging provides more verbose output around the tsm engine. Turning
# this on can provide more useful output for debugging tsm engine issues.
# trace-logging-enabled = false
# Whether queries should be logged before execution. Very useful for troubleshooting, but will
# log any sensitive data contained within a query.
# query-log-enabled = true
# Settings for the TSM engine
# CacheMaxMemorySize is the maximum size a shard's cache can
# reach before it starts rejecting writes.
# Valid size suffixes are k, m, or g (case insensitive, 1024 = 1k).
# Values without a size suffix are in bytes.
# cache-max-memory-size = "1g"
# CacheSnapshotMemorySize is the size at which the engine will
# snapshot the cache and write it to a TSM file, freeing up memory
# Valid size suffixes are k, m, or g (case insensitive, 1024 = 1k).
# Values without a size suffix are in bytes.
# cache-snapshot-memory-size = "25m"
# CacheSnapshotWriteColdDuration is the length of time at
# which the engine will snapshot the cache and write it to
# a new TSM file if the shard hasn't received writes or deletes
# cache-snapshot-write-cold-duration = "10m"
# CompactFullWriteColdDuration is the duration at which the engine
# will compact all TSM files in a shard if it hasn't received a
# write or delete
# compact-full-write-cold-duration = "4h"
# The maximum number of concurrent full and level compactions that can run at one time. A
# value of 0 results in 50% of runtime.GOMAXPROCS(0) used at runtime. Any number greater
# than 0 limits compactions to that value. This setting does not apply
# to cache snapshotting.
# max-concurrent-compactions = 0
# The threshold, in bytes, when an index write-ahead log file will compact
# into an index file. Lower sizes will cause log files to be compacted more
# quickly and result in lower heap usage at the expense of write throughput.
# Higher sizes will be compacted less frequently, store more series in-memory,
# and provide higher write throughput.
# Valid size suffixes are k, m, or g (case insensitive, 1024 = 1k).
# Values without a size suffix are in bytes.
# max-index-log-file-size = "1m"
# The maximum series allowed per database before writes are dropped. This limit can prevent
# high cardinality issues at the database level. This limit can be disabled by setting it to
# 0.
# max-series-per-database = 1000000
# The maximum number of tag values per tag that are allowed before writes are dropped. This limit
# can prevent high cardinality tag values from being written to a measurement. This limit can be
# disabled by setting it to 0.
# max-values-per-tag = 100000
# If true, then the mmap advise value MADV_WILLNEED will be provided to the kernel with respect to
# TSM files. This setting has been found to be problematic on some kernels, and defaults to off.
# It might help users who have slow disks in some cases.
# tsm-use-madv-willneed = false
###
### [coordinator]
###
### Controls the clustering service configuration.
###
[coordinator]
# The default time a write request will wait until a "timeout" error is returned to the caller.
# write-timeout = "10s"
# The maximum number of concurrent queries allowed to be executing at one time. If a query is
# executed and exceeds this limit, an error is returned to the caller. This limit can be disabled
# by setting it to 0.
# max-concurrent-queries = 0
# The maximum time a query will is allowed to execute before being killed by the system. This limit
# can help prevent run away queries. Setting the value to 0 disables the limit.
# query-timeout = "0s"
# The time threshold when a query will be logged as a slow query. This limit can be set to help
# discover slow or resource intensive queries. Setting the value to 0 disables the slow query logging.
# log-queries-after = "0s"
# The maximum number of points a SELECT can process. A value of 0 will make
# the maximum point count unlimited. This will only be checked every second so queries will not
# be aborted immediately when hitting the limit.
# max-select-point = 0
# The maximum number of series a SELECT can run. A value of 0 will make the maximum series
# count unlimited.
# max-select-series = 0
# The maxium number of group by time bucket a SELECT can create. A value of zero will max the maximum
# number of buckets unlimited.
# max-select-buckets = 0
###
### [retention]
###
### Controls the enforcement of retention policies for evicting old data.
###
[retention]
# Determines whether retention policy enforcement enabled.
# enabled = true
# The interval of time when retention policy enforcement checks run.
# check-interval = "30m"
###
### [shard-precreation]
###
### Controls the precreation of shards, so they are available before data arrives.
### Only shards that, after creation, will have both a start- and end-time in the
### future, will ever be created. Shards are never precreated that would be wholly
### or partially in the past.
[shard-precreation]
# Determines whether shard pre-creation service is enabled.
# enabled = true
# The interval of time when the check to pre-create new shards runs.
# check-interval = "10m"
# The default period ahead of the endtime of a shard group that its successor
# group is created.
# advance-period = "30m"
###
### Controls the system self-monitoring, statistics and diagnostics.
###
### The internal database for monitoring data is created automatically if
### if it does not already exist. The target retention within this database
### is called 'monitor' and is also created with a retention period of 7 days
### and a replication factor of 1, if it does not exist. In all cases the
### this retention policy is configured as the default for the database.
[monitor]
# Whether to record statistics internally.
# store-enabled = true
# The destination database for recorded statistics
# store-database = "_internal"
# The interval at which to record statistics
# store-interval = "10s"
###
### [http]
###
### Controls how the HTTP endpoints are configured. These are the primary
### mechanism for getting data into and out of InfluxDB.
###
[http]
# Determines whether HTTP endpoint is enabled.
# enabled = true
# The bind address used by the HTTP service.
bind-address = ":18086"
# Determines whether user authentication is enabled over HTTP/HTTPS.
# auth-enabled = false
# The default realm sent back when issuing a basic auth challenge.
# realm = "InfluxDB"
# Determines whether HTTP request logging is enabled.
# log-enabled = true
# Determines whether the HTTP write request logs should be suppressed when the log is enabled.
# suppress-write-log = false
# When HTTP request logging is enabled, this option specifies the path where
# log entries should be written. If unspecified, the default is to write to stderr, which
# intermingles HTTP logs with internal InfluxDB logging.
#
# If influxd is unable to access the specified path, it will log an error and fall back to writing
# the request log to stderr.
# access-log-path = ""
# Determines whether detailed write logging is enabled.
# write-tracing = false
# Determines whether the pprof endpoint is enabled. This endpoint is used for
# troubleshooting and monitoring.
# pprof-enabled = true
# Enables a pprof endpoint that binds to localhost:6060 immediately on startup.
# This is only needed to debug startup issues.
# debug-pprof-enabled = false
# Determines whether HTTPS is enabled.
# https-enabled = false
# The SSL certificate to use when HTTPS is enabled.
# https-certificate = "/etc/ssl/influxdb.pem"
# Use a separate private key location.
# https-private-key = ""
# The JWT auth shared secret to validate requests using JSON web tokens.
# shared-secret = ""
# The default chunk size for result sets that should be chunked.
# max-row-limit = 0
# The maximum number of HTTP connections that may be open at once. New connections that
# would exceed this limit are dropped. Setting this value to 0 disables the limit.
# max-connection-limit = 0
# Enable http service over unix domain socket
# unix-socket-enabled = false
# The path of the unix domain socket.
# bind-socket = "/var/run/influxdb.sock"
# The maximum size of a client request body, in bytes. Setting this value to 0 disables the limit.
# max-body-size = 25000000
# The maximum number of writes processed concurrently.
# Setting this to 0 disables the limit.
# max-concurrent-write-limit = 0
# The maximum number of writes queued for processing.
# Setting this to 0 disables the limit.
# max-enqueued-write-limit = 0
# The maximum duration for a write to wait in the queue to be processed.
# Setting this to 0 or setting max-concurrent-write-limit to 0 disables the limit.
# enqueued-write-timeout = 0
###
### [ifql]
###
### Configures the ifql RPC API.
###
[ifql]
# Determines whether the RPC service is enabled.
# enabled = true
# Determines whether additional logging is enabled.
# log-enabled = true
# The bind address used by the ifql RPC service.
# bind-address = ":8082"
###
### [logging]
###
### Controls how the logger emits logs to the output.
###
[logging]
# Determines which log encoder to use for logs. Available options
# are auto, logfmt, and json. auto will use a more a more user-friendly
# output format if the output terminal is a TTY, but the format is not as
# easily machine-readable. When the output is a non-TTY, auto will use
# logfmt.
# format = "auto"
# Determines which level of logs will be emitted. The available levels
# are error, warn, info, and debug. Logs that are equal to or above the
# specified level will be emitted.
# level = "info"
# Suppresses the logo output that is printed when the program is started.
# The logo is always suppressed if STDOUT is not a TTY.
# suppress-logo = false
###
### [subscriber]
###
### Controls the subscriptions, which can be used to fork a copy of all data
### received by the InfluxDB host.
###
[subscriber]
# Determines whether the subscriber service is enabled.
# enabled = true
# The default timeout for HTTP writes to subscribers.
# http-timeout = "30s"
# Allows insecure HTTPS connections to subscribers. This is useful when testing with self-
# signed certificates.
# insecure-skip-verify = false
# The path to the PEM encoded CA certs file. If the empty string, the default system certs will be used
# ca-certs = ""
# The number of writer goroutines processing the write channel.
# write-concurrency = 40
# The number of in-flight writes buffered in the write channel.
# write-buffer-size = 1000
###
### [[graphite]]
###
### Controls one or many listeners for Graphite data.
###
[[graphite]]
# Determines whether the graphite endpoint is enabled.
# enabled = false
# database = "graphite"
# retention-policy = ""
# bind-address = ":2003"
# protocol = "tcp"
# consistency-level = "one"
# These next lines control how batching works. You should have this enabled
# otherwise you could get dropped metrics or poor performance. Batching
# will buffer points in memory if you have many coming in.
# Flush if this many points get buffered
# batch-size = 5000
# number of batches that may be pending in memory
# batch-pending = 10
# Flush at least this often even if we haven't hit buffer limit
# batch-timeout = "1s"
# UDP Read buffer size, 0 means OS default. UDP listener will fail if set above OS max.
# udp-read-buffer = 0
### This string joins multiple matching 'measurement' values providing more control over the final measurement name.
# separator = "."
### Default tags that will be added to all metrics. These can be overridden at the template level
### or by tags extracted from metric
# tags = ["region=us-east", "zone=1c"]
### Each template line requires a template pattern. It can have an optional
### filter before the template and separated by spaces. It can also have optional extra
### tags following the template. Multiple tags should be separated by commas and no spaces
### similar to the line protocol format. There can be only one default template.
# templates = [
# "*.app env.service.resource.measurement",
# # Default template
# "server.*",
# ]
###
### [collectd]
###
### Controls one or many listeners for collectd data.
###
[[collectd]]
# enabled = false
# bind-address = ":25826"
# database = "collectd"
# retention-policy = ""
#
# The collectd service supports either scanning a directory for multiple types
# db files, or specifying a single db file.
# typesdb = "/usr/local/share/collectd"
#
# security-level = "none"
# auth-file = "/etc/collectd/auth_file"
# These next lines control how batching works. You should have this enabled
# otherwise you could get dropped metrics or poor performance. Batching
# will buffer points in memory if you have many coming in.
# Flush if this many points get buffered
# batch-size = 5000
# Number of batches that may be pending in memory
# batch-pending = 10
# Flush at least this often even if we haven't hit buffer limit
# batch-timeout = "10s"
# UDP Read buffer size, 0 means OS default. UDP listener will fail if set above OS max.
# read-buffer = 0
# Multi-value plugins can be handled two ways.
# "split" will parse and store the multi-value plugin data into separate measurements
# "join" will parse and store the multi-value plugin as a single multi-value measurement.
# "split" is the default behavior for backward compatability with previous versions of influxdb.
# parse-multivalue-plugin = "split"
###
### [opentsdb]
###
### Controls one or many listeners for OpenTSDB data.
###
[[opentsdb]]
# enabled = false
# bind-address = ":4242"
# database = "opentsdb"
# retention-policy = ""
# consistency-level = "one"
# tls-enabled = false
# certificate= "/etc/ssl/influxdb.pem"
# Log an error for every malformed point.
# log-point-errors = true
# These next lines control how batching works. You should have this enabled
# otherwise you could get dropped metrics or poor performance. Only points
# metrics received over the telnet protocol undergo batching.
# Flush if this many points get buffered
# batch-size = 1000
# Number of batches that may be pending in memory
# batch-pending = 5
# Flush at least this often even if we haven't hit buffer limit
# batch-timeout = "1s"
###
### [[udp]]
###
### Controls the listeners for InfluxDB line protocol data via UDP.
###
[[udp]]
# enabled = false
# bind-address = ":8089"
# database = "udp"
# retention-policy = ""
# InfluxDB precision for timestamps on received points ("" or "n", "u", "ms", "s", "m", "h")
# precision = ""
# These next lines control how batching works. You should have this enabled
# otherwise you could get dropped metrics or poor performance. Batching
# will buffer points in memory if you have many coming in.
# Flush if this many points get buffered
# batch-size = 5000
# Number of batches that may be pending in memory
# batch-pending = 10
# Will flush at least this often even if we haven't hit buffer limit
# batch-timeout = "1s"
# UDP Read buffer size, 0 means OS default. UDP listener will fail if set above OS max.
# read-buffer = 0
###
### [continuous_queries]
###
### Controls how continuous queries are run within InfluxDB.
###
[continuous_queries]
# Determines whether the continuous query service is enabled.
# enabled = true
# Controls whether queries are logged when executed by the CQ service.
# log-enabled = true
# Controls whether queries are logged to the self-monitoring data store.
# query-stats-enabled = false
# interval for how often continuous queries will be checked if they need to run
# run-interval = "1s"
###
### [tls]
###
### Global configuration settings for TLS in InfluxDB.
###
[tls]
# Determines the available set of cipher suites. See https://golang.org/pkg/crypto/tls/#pkg-constants
# for a list of available ciphers, which depends on the version of Go (use the query
# SHOW DIAGNOSTICS to see the version of Go used to build InfluxDB). If not specified, uses
# the default settings from Go's crypto/tls package.
# ciphers = [
# "TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305",
# "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
# ]
# Minimum version of the tls protocol that will be negotiated. If not specified, uses the
# default settings from Go's crypto/tls package.
# min-version = "tls1.2"
# Maximum version of the tls protocol that will be negotiated. If not specified, uses the
# default settings from Go's crypto/tls package.
# max-version = "tls1.2"
4、进入数据库
influx
注意:
如果修改了端口在启动的时候输入指定端口即可:
influx -host localhost -port 18086
在数据库使用中和常见得Mysql等稍微有些差异
[root@VM-centos ~]# influx -host localhost -port 18086
Connected to http://localhost:18086 version 1.6.1
InfluxDB shell version: 1.6.1
> show databases
name: databases
name
----
_internal
monitor
> use monitor
Using database monitor
> show measurements
name: measurements
name
----
monitor_desktopusing
InfluxDB 的统计记录点
下面来看一下influxdb point的组成:
+-----------+--------+-+---------+-+---------+
|measurement|,tag_set| |field_set| |timestamp|
+-----------+--------+-+---------+-+---------+
measurement 在influxdb中,可以把它等同于sql中的table
tag_set 统计point的tags,所有的tag会被索引,其value值只能使用string,主要于在数据展示中group by之类(后面会讲解如果使用tag和field)
field_set 统计point的fields,该值不会被索引,因此如果对field做查询会比较慢,一般用于记录取值范围较大的数据
timestamp 该point的生成时间,如果不传该值,influxdb在写入该point的时候,会自动生成
InfluxDB 的一次多统计记录点写入
下面来看一个使用http post方式写入数据到influxdb的例子:
curl -i -XPOST 'http://localhost:8086/write?db=mydb' --data-binary 'cpu_load_short,host=server02 value=0.67 cpu_load_short,host=server02 value=0.55 cpu_load_short,host=server01 value=2.0'
influxdb的基本使用
influxDB名词
database:数据库;
measurement:数据库中的表;
points:表里面的一行数据。
influxDB中独有的一些概念
Point由时间戳(time)、数据(field)和标签(tags)组成。
time:每条数据记录的时间,也是数据库自动生成的主索引;
fields:各种记录的值;
tags:各种有索引的属性。
还有一个重要的名词:series
所有在数据库中的数据,都需要通过图表来表示,series表示这个表里面的所有的数据可以在图标上画成几条线(注:线条的个数由tags排列组合计算出来)
举个简单的小例子:
假如数据库内数据为(abc为tags):
a=1,b=1,c=1
a=1,b=2,c=1
a=1,b=3,c=1
a=1,b=3,c=1
a=1,b=3,c=1
a=2,b=1,c=1
a=1,b=1,c=1
a=1,b=1,c=1
输入show series from 表名
得到的是:
key
---
表名,a=1,b=1,c=1
表名,a=1,b=2,c=1
表名,a=1,b=3,c=1
表名,a=2,b=1,c=1
也就是看数据能够组成几种排列组合。
influxDB基本操作
数据库与表的操作
#创建数据库
create database "db_name"
#显示所有的数据库
show databases
#删除数据库
drop database "db_name"
#使用数据库
use db_name
#显示该数据库中所有的表
show measurements
#创建表,直接在插入数据的时候指定表名
insert test,host=127.0.0.1,monitor_name=test count=1
#删除表
drop measurement "measurement_name"
增加
> use metrics
Using database metrics
> insert test,host=127.0.0.1,monitor_name=test count=1
查询
> use metrics
Using database metrics
> select * from test order by time desc
其它
SHOW FIELD KEYS --查看当前数据库所有表的字段
SHOW series from pay --查看key数据
SHOW TAG KEYS FROM "pay" --查看key中tag key值
SHOW TAG VALUES FROM "pay" WITH KEY = "merId" --查看key中tag 指定key值对应的值
SHOW TAG VALUES FROM cpu WITH KEY IN ("region", "host") WHERE service = 'redis'
DROP SERIES FROM <measurement_name[,measurement_name]> WHERE <tag_key>='<tag_value>' --删除key
SHOW CONTINUOUS QUERIES --查看连续执行命令
SHOW QUERIES --查看最后执行命令
KILL QUERY <qid> --结束命令
SHOW RETENTION POLICIES ON mydb --查看保留数据
查询数据
SELECT * FROM /.*/ LIMIT 1 --查询当前数据库下所有表的第一行记录
select * from pay order by time desc limit 2
select * from db_name."POLICIES name".measurement_name --指定查询数据库下数据保留中的表数据 POLICIES name数据保留
删除数据
delete from "query" --删除表所有数据,则表就不存在了
drop MEASUREMENT "query" --删除表(注意会把数据保留删除使用delete不会)
DELETE FROM cpu
DELETE FROM cpu WHERE time < '2000-01-01T00:00:00Z'
DELETE WHERE time < '2000-01-01T00:00:00Z'
DROP DATABASE “testDB” --删除数据库
DROP RETENTION POLICY "dbbak" ON mydb --删除保留数据为dbbak数据
DROP SERIES from pay where tag_key='' --删除key中的tag
SHOW SHARDS --查看数据存储文件
DROP SHARD 1
SHOW SHARD GROUPS
SHOW SUBSCRIPTIONS
数据保存策略
influxdb虽然没有删除语句,但是可以设置类似于定期清理的语句。
show retention policies on "db_name"
> show retention policies on "monitor"
name duration shardGroupDuration replicaN default
---- -------- ------------------ -------- -------
autogen 0s 168h0m0s 1 true
>
创建新的Retention Policies
create retention policy "rp_name" on "db_name" duration 3w replication 1 default
rp_name:策略名
db_name:具体的数据库名
3w:保存3周,3周之前的数据将被删除,influxdb具有各种事件参数,比如:h(小时),d(天),w(星期)
replication 1:副本个数,一般为1就可以了
default:设置为默认策略
修改Retention Policies
alter retention policy "rp_name" on "db_name" duration 30d default
删除Retention Policies
drop retention policy "rp_name" on "db_name"
用户管理
#显示用户
show users
#创建用户
create user "username" with password 'password'
#创建管理员权限用户
create user "username" with password 'password' with all privileges
#删除用户
drop user "username"
以时间删除数据
delete from monitor_desktopusing where time < 1631003842029975123
以tag (host)来删除数据
delete from monitor_desktopusing where host='xx.xx.xxx.xxx'
> ALTER RETENTION POLICY "autogen" ON "monitor" DURATION 720h DEFAULT
> SHOW RETENTION POLICIES ON monitor
name duration shardGroupDuration replicaN default
---- -------- ------------------ -------- -------
autogen 720h0m0s 168h0m0s 1 true