Introduction to etcd

System requirements

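Because etcd writes data to disk, its performance depends heavily on disk performance, so SSDs are strongly recommended. One way to assess whether a disk is fast enough for etcd is to use a disk benchmarking tool such as fio. To prevent performance degradation or inadvertently overloading the key-value store, etcd enforces a configurable storage size quota, set to 2GB by default. To avoid swapping or running out of memory, the machine should have at least as much RAM as the quota. 8GB is the suggested maximum size for normal environments, and etcd warns at startup if the configured value exceeds it. At CoreOS, etcd clusters are usually deployed on dedicated CoreOS Container Linux machines with dual-core processors, 2GB of RAM, and at least 80GB of SSD. Note that performance is intrinsically workload dependent; test before deploying to production.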
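The quota figures above can be turned into byte counts with a little shell arithmetic. This is a minimal sketch, assuming the quota is handed to etcd through its `--quota-backend-bytes` flag and that "GB" here means GiB (1024^3 bytes); the helper names `gib_to_bytes` and `check_quota` are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the storage quota figures discussed above.

# Convert a size in GiB to bytes (the unit --quota-backend-bytes expects).
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Compare a quota (in GiB) against the suggested 8GB maximum.
check_quota() {
    if [ "$1" -gt 8 ]; then
        echo "warning: ${1}GiB exceeds the suggested 8GiB maximum"
    else
        echo "ok: ${1}GiB is within the suggested maximum"
    fi
}

gib_to_bytes 2     # the default quota
gib_to_bytes 8     # the suggested maximum
check_quota 16
```

The resulting number could then be passed on the command line, for example `etcd --quota-backend-bytes=8589934592` for an 8GB quota.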

Why an odd number of cluster members?

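An etcd cluster needs a majority of nodes (a quorum) to agree on updates to the cluster state. For a cluster with n members, the quorum is (n/2)+1, using integer division. For any odd-sized cluster, adding one node always increases the number of nodes required for quorum. Although adding a node to an odd-sized cluster looks better because there are more machines, the fault tolerance is worse: exactly the same number of nodes may fail without losing quorum, but there are now more nodes that can fail. If the cluster is in a state where it cannot tolerate any more failures, adding a node before removing nodes is dangerous, because if the new node fails to register with the cluster (for example, because its address is misconfigured), quorum will be permanently lost.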
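The quorum formula can be checked with a few lines of shell. This is only an illustrative sketch; the function names `quorum` and `fault_tolerance` are made up for this example.

```shell
#!/bin/sh
# Quorum for an n-member cluster: (n/2)+1, using integer division.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Number of members that may fail while a quorum still remains.
fault_tolerance() {
    echo $(( $1 - $(quorum "$1") ))
}

# Print quorum and fault tolerance for cluster sizes 1 through 7.
n=1
while [ "$n" -le 7 ]; do
    echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
    n=$(( n + 1 ))
done
```

In the printed table, sizes 3 and 4 both tolerate one failure, and sizes 5 and 6 both tolerate two, which is why the even sizes add cost without adding fault tolerance.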

What is the maximum cluster size?

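Theoretically, there is no hard limit. However, an etcd cluster probably should not have more than seven nodes. Google's Chubby lock service, which is similar to etcd and has been widely deployed within Google for many years, suggests running five nodes. A 5-member etcd cluster can tolerate the failure of two members, which is enough in most cases. Although larger clusters provide better fault tolerance, write performance suffers because data must be replicated across more machines.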

Should I add a member before removing an unhealthy member?

When replacing an etcd node, it is important to remove the member first and then add its replacement.
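The remove-then-add order can be sketched as a small helper that only prints the two etcdctl commands rather than running them. The function name `replace_member_plan` is hypothetical; the member ID is the one listed for etcd-2 later in this article, while the new member name `etcd-4` and its peer URL are placeholder assumptions.

```shell
#!/bin/sh
# Print (not run) the etcdctl commands for replacing a member:
# first remove the old member, then add the replacement.
replace_member_plan() {
    endpoints=$1; old_id=$2; new_name=$3; new_peer_url=$4
    echo "etcdctl --endpoints=${endpoints} member remove ${old_id}"
    echo "etcdctl --endpoints=${endpoints} member add ${new_name} --peer-urls=${new_peer_url}"
}

# Placeholder values: etcd-4 and port 2680 are assumptions for illustration.
replace_member_plan 127.0.0.1:2379 72ab37cc61e2023b etcd-4 http://127.0.0.1:2680
```

Printing the plan first makes it easy to review the member ID and peer URL before touching the cluster.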

Why does etcd lose its leader on disk latency spikes?

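This is intentional; disk latency is part of leader liveness. Suppose the cluster leader takes a minute to sync a raft log update to disk, but the etcd cluster has a one-second election timeout. Even though the leader can process network messages within the election interval (for example, send heartbeats), it is effectively unavailable because it cannot commit any new proposals; it is waiting on the slow disk. If the cluster frequently loses its leader because of disk latency, try tuning the disk settings or the etcd timing parameters.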
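The timing parameters mentioned above are the heartbeat interval and the election timeout, set with etcd's `--heartbeat-interval` and `--election-timeout` flags (both in milliseconds). The following is a rough sketch of a sanity check before changing them. It assumes, as a rule of thumb, that the election timeout should be at least five times the heartbeat interval; the helper name `check_timing` and the second pair of values are hypothetical.

```shell
#!/bin/sh
# Check a heartbeat interval / election timeout pair (both in milliseconds).
# Assumption: the election timeout should be at least 5x the heartbeat interval.
check_timing() {
    heartbeat_ms=$1; election_ms=$2
    if [ "$election_ms" -lt $(( heartbeat_ms * 5 )) ]; then
        echo "too tight: election timeout ${election_ms}ms < 5 x heartbeat ${heartbeat_ms}ms"
        return 1
    fi
    # Print the command line that would apply these values.
    echo "etcd --heartbeat-interval=${heartbeat_ms} --election-timeout=${election_ms}"
}

check_timing 100 1000    # etcd's default values
check_timing 500 5000    # hypothetical values for a slower disk or network
```

Every member of a cluster should use the same values, so a change like this would normally be rolled out to all nodes.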

Setting up an etcd cluster

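Environment: a single physical machine running a 3-node etcd cluster, with each node on its own ports. An odd number of nodes is recommended because, as explained above, an extra node in an even-sized cluster raises the quorum without improving fault tolerance.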

Step 1: download the etcd release package

wget -c https://github.com/etcd-io/etcd/releases/download/v3.5.2/etcd-v3.5.2-linux-amd64.tar.gz

Step 2: extract it, then create the etcd1.conf configuration file

name: etcd-1
data-dir: /root/etcd1/data 
listen-client-urls: http://0.0.0.0:2379
advertise-client-urls: http://127.0.0.1:2379
listen-peer-urls: http://0.0.0.0:2380
initial-advertise-peer-urls: http://127.0.0.1:2380
initial-cluster: etcd-1=http://127.0.0.1:2380,etcd-2=http://127.0.0.1:2480,etcd-3=http://127.0.0.1:2580
initial-cluster-token: etcd-cluster-my
initial-cluster-state: new

etcd2.conf configuration file

name: etcd-2
data-dir: /root/etcd2/data
listen-client-urls: http://0.0.0.0:2479
advertise-client-urls: http://127.0.0.1:2479
listen-peer-urls: http://0.0.0.0:2480
initial-advertise-peer-urls: http://127.0.0.1:2480
initial-cluster: etcd-1=http://127.0.0.1:2380,etcd-2=http://127.0.0.1:2480,etcd-3=http://127.0.0.1:2580
initial-cluster-token: etcd-cluster-my
initial-cluster-state: new

etcd3.conf configuration file

name: etcd-3
data-dir: /root/etcd3/data 
listen-client-urls: http://0.0.0.0:2579
advertise-client-urls: http://127.0.0.1:2579
listen-peer-urls: http://0.0.0.0:2580
initial-advertise-peer-urls: http://127.0.0.1:2580
initial-cluster: etcd-1=http://127.0.0.1:2380,etcd-2=http://127.0.0.1:2480,etcd-3=http://127.0.0.1:2580
initial-cluster-token: etcd-cluster-my
initial-cluster-state: new
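Since the three files differ only in the member name, data directory, and ports, they could also be generated by a loop. The sketch below reproduces the settings shown above; `OUT_DIR` is a hypothetical output directory chosen for this example, and the generated files would still need to be copied next to each member's etcd binary.

```shell
#!/bin/sh
# Generate etcd1.conf..etcd3.conf for a three-member cluster on one host.
# Member i uses client port 2279+100*i and peer port 2280+100*i,
# which yields 2379/2380, 2479/2480, and 2579/2580.
# OUT_DIR is a hypothetical output directory for this sketch.
OUT_DIR=${OUT_DIR:-./etcd-conf}
CLUSTER="etcd-1=http://127.0.0.1:2380,etcd-2=http://127.0.0.1:2480,etcd-3=http://127.0.0.1:2580"

mkdir -p "$OUT_DIR"
for i in 1 2 3; do
    client_port=$(( 2279 + 100 * i ))
    peer_port=$(( 2280 + 100 * i ))
    cat > "${OUT_DIR}/etcd${i}.conf" <<EOF
name: etcd-${i}
data-dir: /root/etcd${i}/data
listen-client-urls: http://0.0.0.0:${client_port}
advertise-client-urls: http://127.0.0.1:${client_port}
listen-peer-urls: http://0.0.0.0:${peer_port}
initial-advertise-peer-urls: http://127.0.0.1:${peer_port}
initial-cluster: ${CLUSTER}
initial-cluster-token: etcd-cluster-my
initial-cluster-state: new
EOF
done
```

Generating the files this way keeps the `initial-cluster` line identical across members, which is easy to get wrong when editing three files by hand.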

Write a startup script

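#!/bin/bash
# Start all three members in the background, each from its own directory.
CRTDIR=$(pwd)
servers=("etcd1" "etcd2" "etcd3")
for server in "${servers[@]}"
do
    cd "${CRTDIR}/${server}" || exit 1
    # Each directory holds the etcd binary and that member's config file (etcd1.conf, ...).
    nohup ./etcd --config-file="${server}.conf" &
    # $? is always 0 right after backgrounding a job, so report the PID instead.
    echo "started ${server}, pid $!"
done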

After running the script, check the processes and ports
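One possible way to do this check from the shell is sketched below. It assumes bash and the coreutils `timeout` command are available (the `/dev/tcp` path is a bash feature), and the helper name `port_open` is hypothetical. The ports are the client and peer ports from the three configuration files.

```shell
#!/bin/sh
# Report whether each client and peer port of the three members accepts TCP connections.
# Assumption: bash and coreutils `timeout` are installed; /dev/tcp is bash-specific.
port_open() {
    timeout 1 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

for port in 2379 2380 2479 2480 2579 2580; do
    if port_open 127.0.0.1 "$port"; then
        echo "port ${port}: listening"
    else
        echo "port ${port}: not reachable"
    fi
done

# List running etcd processes (the [e] keeps grep from matching itself).
ps -ef | grep '[e]tcd' || echo "no etcd process found"
```

If all six ports report as listening and three etcd processes are shown, the members have started and the cluster status can be checked with etcdctl.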


Check the cluster status

[root@VM-0-15-centos etcd1]# ./etcdctl --write-out=table --endpoints=127.0.0.1:2379,127.0.0.1:2479,127.0.0.1:2579 endpoint status
+----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|    ENDPOINT    |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| 127.0.0.1:2379 | 47a42fb96a975854 |   3.5.2 |   20 kB |     false |      false |         4 |         31 |                 31 |        |
| 127.0.0.1:2479 | 72ab37cc61e2023b |   3.5.2 |   20 kB |     false |      false |         4 |         31 |                 31 |        |
| 127.0.0.1:2579 | 470f778210a711ed |   3.5.2 |   20 kB |      true |      false |         4 |         31 |                 31 |        |
+----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
[root@VM-0-15-centos etcd1]# ./etcdctl --endpoints=$ENDPOINTS endpoint health
127.0.0.1:2579 is healthy: successfully committed proposal: took = 5.033785ms
127.0.0.1:2479 is healthy: successfully committed proposal: took = 5.003254ms
127.0.0.1:2379 is healthy: successfully committed proposal: took = 4.990036ms
[root@VM-0-15-centos etcd1]# ./etcdctl -w table member list
+------------------+---------+--------+-----------------------+-----------------------+------------+
|        ID        | STATUS  |  NAME  |      PEER ADDRS       |     CLIENT ADDRS      | IS LEARNER |
+------------------+---------+--------+-----------------------+-----------------------+------------+
| 470f778210a711ed | started | etcd-3 | http://127.0.0.1:2580 | http://127.0.0.1:2579 |      false |
| 47a42fb96a975854 | started | etcd-1 | http://127.0.0.1:2380 | http://127.0.0.1:2379 |      false |
| 72ab37cc61e2023b | started | etcd-2 | http://127.0.0.1:2480 | http://127.0.0.1:2479 |      false |
+------------------+---------+--------+-----------------------+-----------------------+------------+

 

Day-to-day operations

In the commands below, ENDPOINTS="127.0.0.1:2379,127.0.0.1:2479,127.0.0.1:2579"

# Add data
[root@VM-0-15-centos etcd1]# ./etcdctl put /etc/password 123456

# Delete data
[root@VM-0-15-centos etcd1]# ./etcdctl del /etc/password
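
# Defragment offline, directly on a stopped member's data directory (--data-dir),
# for example the data-dir from etcd1.conf
./etcdutl defrag --data-dir /root/etcd1/data

# Defragment online, against specific endpoints (--endpoints)
./etcdctl --endpoints=localhost:2379,badendpoint:2379 defrag

# Defragment online, against every member of the cluster (--cluster)
./etcdctl defrag --cluster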

Migrating from etcd v2 to etcd v3

# write key in etcd version 2 store
export ETCDCTL_API=2
etcdctl --endpoints=http://$ENDPOINT set foo bar

# read key in etcd v2
etcdctl --endpoints=$ENDPOINTS --output="json" get foo

# stop etcd node to migrate, one by one

# migrate v2 data
export ETCDCTL_API=3
etcdctl --endpoints=$ENDPOINT migrate --data-dir="default.etcd" --wal-dir="default.etcd/member/wal"

# restart etcd node after migrate, one by one

# confirm that the key got migrated
etcdctl --endpoints=$ENDPOINTS get /foo