Creating a Docker Swarm Cluster
Official documentation: https://docs.docker.com/engine/swarm/manage-nodes/#add-or-remove-label-metadata
Command:
docker swarm init
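Running the command above creates a single-node swarm on the current node. The Docker Engine sets up the swarm as follows:
- Switches the current node into swarm mode
- Creates a swarm named default
- Designates the current node as a leader manager node for the swarm
- Names the node with the machine hostname
- Configures the manager to listen on an active network interface on port 2377
- Sets the current node to Active availability, meaning it can receive tasks from the scheduler
- Starts an internal distributed data store for the Engines participating in the swarm, to maintain a consistent view of the swarm and all services running on it
- By default, generates a self-signed root CA for the swarm
- By default, generates tokens for worker and manager nodes to join the swarm
- Creates an overlay network named ingress for publishing service ports external to the swarm
Create the swarm: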
$ docker swarm init
Swarm initialized: current node (dxn1zf6l61qsb1josjja83ngz) is now a manager.
To add a worker to this swarm, run the following command:
docker swarm join \
--token SWMTKN-1-49nj1cmql0jkz5s954yi3oex3nedyz0fb0xx14ie39trti4wxv-8vxv8rssmk743ojnwacrr2e7c \
192.168.99.100:2377
To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
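Manager nodes use an advertise address to give other nodes in the swarm access to the Swarmkit API and overlay networking. The other nodes in the swarm must be able to reach the manager node on its advertise address.
If the manager node has more than one IP address, use --advertise-addr to specify which one to advertise.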
$ docker swarm init --advertise-addr <MANAGER-IP>
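Because docker swarm init fails on a node that already belongs to a swarm, a provisioning script may want to check the local state first. The following is a minimal sketch, not an official recipe: the function names swarm_state and ensure_swarm are made up for illustration, and it assumes the .Swarm.LocalNodeState field reported by docker info.

```shell
# Print the local swarm state as reported by the engine
# (for example: inactive, pending, active, locked, error).
swarm_state() {
  docker info --format '{{.Swarm.LocalNodeState}}'
}

# ensure_swarm <MANAGER-IP>
# Hypothetical helper: initialize a swarm only if this engine is not in one yet.
ensure_swarm() {
  local addr="$1"
  if [ "$(swarm_state)" = "inactive" ]; then
    docker swarm init --advertise-addr "$addr"
  else
    echo "swarm state is already '$(swarm_state)'; skipping init"
  fi
}

# Example call, using the manager address that appears in this article:
#   ensure_swarm 192.168.99.100
```

Nothing runs until ensure_swarm is called, so the snippet can be sourced into a larger setup script.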
docker swarm init [options]
Options:
Name, shorthand | Default | Description |
--advertise-addr | | Advertised address (format: <ip|interface>[:port]) |
--autolock | | Enable manager autolocking (requiring an unlock key to start a stopped manager) |
--availability | active | Availability of the node ("active"|"pause"|"drain") |
--cert-expiry | 2160h0m0s | Validity period for node certificates (ns|us|ms|s|m|h) |
--data-path-addr | | Address or interface to use for data path traffic (format: <ip|interface>) |
--dispatcher-heartbeat | 5s | Dispatcher heartbeat period (ns|us|ms|s|m|h) |
--external-ca | | Specifications of one or more certificate signing endpoints |
--force-new-cluster | | Force create a new cluster from current state |
--listen-addr | 0.0.0.0:2377 | Listen address; defaults to 0.0.0.0:2377, and a network interface can be given instead, e.g. --listen-addr eth0:2377 |
--max-snapshots | | API 1.25+ Number of additional Raft snapshots to retain |
--snapshot-interval | 10000 | API 1.25+ Number of log entries between Raft snapshots |
--task-history-limit | 5 | Task history retention limit |
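docker swarm subcommands
Command | Description |
docker swarm ca | Display and rotate the root CA |
docker swarm init | Initialize a swarm |
docker swarm join | Join a swarm as a node and/or manager |
docker swarm join-token | Manage join tokens |
docker swarm leave | Leave the swarm |
docker swarm unlock | Unlock swarm |
docker swarm unlock-key | Manage the unlock key |
docker swarm update | Update the swarm |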
Show the commands and tokens for adding manager and worker nodes to the swarm
docker swarm join-token (manager|worker)
For example:
docker@node1:~$ docker swarm join-token manager
To add a manager to this swarm, run the following command:
docker swarm join --token SWMTKN-1-56p9syspydedeie2b4bn5luuh0pvuyyilxm3o66gsmxj5x9x06-6njv6z5e8zqm6hiofp1j7lojz 192.168.99.100:2377
docker@node1:~$ docker swarm join-token worker
To add a worker to this swarm, run the following command:
docker swarm join --token SWMTKN-1-56p9syspydedeie2b4bn5luuh0pvuyyilxm3o66gsmxj5x9x06-5ag9p9vjwo4zwemiumm2v3hsj 192.168.99.100:2377
Use the --quiet flag to print only the token
docker@node1:~$ docker swarm join-token worker --quiet
SWMTKN-1-56p9syspydedeie2b4bn5luuh0pvuyyilxm3o66gsmxj5x9x06-5ag9p9vjwo4zwemiumm2v3hsj
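The --quiet output is convenient in scripts, where the token can be captured and turned into the join command for a new node. Below is a rough sketch under that assumption. The helper name join_command is hypothetical, and the manager address is passed in by the caller rather than discovered.

```shell
# join_command <worker|manager> <MANAGER-ADDR>
# Hypothetical helper: print the join command for the given role.
# Must be run on a manager node, because only managers can read join tokens.
join_command() {
  local role="$1" manager_addr="$2" token
  # Assign separately from `local` so a docker failure is not masked.
  token="$(docker swarm join-token --quiet "$role")" || return 1
  printf 'docker swarm join --token %s %s\n' "$token" "$manager_addr"
}

# Example call, using the manager address from the output above:
#   join_command worker 192.168.99.100:2377
```

The printed line can then be executed on the node that should join. Since it contains the token, it deserves the same care as the token itself.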
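Note: the swarm join tokens are sensitive and should be stored carefully. A good practice is to rotate them at least every 6 months. Rotate them immediately in any of the following cases: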
- If a token was checked-in by accident into a version control system, group chat or accidentally printed to your logs.
- If you suspect a node has been compromised.
- If you wish to guarantee that no new nodes can join the swarm.
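Rotating the join token
A join token is rotated with the --rotate flag, and the role argument is required. To rotate the worker token, run:
docker swarm join-token --rotate worker
To rotate the manager token, run:
docker swarm join-token --rotate manager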
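Joining a swarm
Official documentation: https://docs.docker.com/engine/swarm/join-nodes/
Show the command and token for joining a worker node
$ docker swarm join-token worker
Show the command and token for joining a manager node
$ docker swarm join-token manager
Docker recommends three or five manager nodes per cluster for high availability. Because swarm mode manager nodes share data using Raft, the number of managers should be odd: adding one more manager to an odd-sized group raises the quorum size without letting the swarm tolerate any additional failure.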
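The arithmetic behind the odd-number recommendation is easy to check. Raft needs a majority of managers to agree, so with N managers the quorum is floor(N/2) + 1 and the swarm survives the loss of floor((N-1)/2) managers. The short sketch below only illustrates that formula. The function names are made up for this example.

```shell
# quorum <N>: number of managers that must agree (a strict majority).
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# fault_tolerance <N>: number of managers that can be lost while keeping quorum.
fault_tolerance() {
  echo $(( ($1 - 1) / 2 ))
}

# Print the values for manager counts 1 through 7.
for n in 1 2 3 4 5 6 7; do
  printf 'managers=%d quorum=%d tolerated_failures=%d\n' \
    "$n" "$(quorum "$n")" "$(fault_tolerance "$n")"
done
```

Going from 3 to 4 managers raises the quorum from 2 to 3 while the tolerated failures stay at 1, which is why 3 or 5 managers are the usual choices.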
Manager nodes in a swarm
Official documentation: https://docs.docker.com/engine/swarm/manage-nodes/#add-or-remove-label-metadata
Manager nodes mainly provide the following functions:
To list the nodes in the cluster, run docker node ls
docker@node1:~$ docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
ynts1kkxwcag1gh4scqa4ksn2 * node1 Ready Active Reachable 18.05.0-ce
zkgq54xua406xftgqx5ata9cy node2 Ready Active Reachable 18.05.0-ce
qu87mli2opl3svm99bpfcpvs0 node3 Ready Active 18.05.0-ce
kasxp5172plh5aqck76aao4gg node4 Ready Active Leader 18.05.0-ce
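The AVAILABILITY column shows whether the scheduler can assign tasks to the node:
- Active: the scheduler can assign tasks to the node.
- Pause: the scheduler does not assign new tasks to the node, but existing tasks keep running.
- Drain: the scheduler does not assign new tasks to the node. It shuts down all existing tasks and schedules them on an available node.
The MANAGER STATUS column shows the node's participation in the Raft consensus:
- No value: a worker node that does not participate in swarm management.
- Leader: the node is the primary manager node, which makes all swarm management and orchestration decisions for the swarm.
- Reachable: the node is a manager participating in the Raft consensus quorum. If the leader node becomes unavailable, this node is eligible for election as the new leader.
- Unavailable: the node is a manager that cannot communicate with other managers. If a manager node becomes unavailable, either join a new manager node to the swarm or promote a worker node to manager.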
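The MANAGER STATUS column can also be checked from a script. The sketch below assumes the role filter and the .Hostname and .ManagerStatus format fields of docker node ls. The function name unhealthy_managers is hypothetical. Rather than matching one particular status string, it reports every manager that is neither Leader nor Reachable.

```shell
# Print the hostname of every manager whose MANAGER STATUS is
# neither "Leader" nor "Reachable". Must be run on a manager node.
unhealthy_managers() {
  docker node ls --filter role=manager \
    --format '{{.Hostname}} {{.ManagerStatus}}' |
    awk '$2 != "Leader" && $2 != "Reachable" { print $1 }'
}

# Example call:
#   unhealthy_managers
```

An empty result means every manager is currently taking part in the quorum.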
Inspecting an individual node
$ docker node inspect <NODE-ID>
The output is JSON by default, but you can pass the --pretty flag to print the result in a human-readable format
For example:
docker@node1:~$ docker node inspect node1 --pretty
ID: ynts1kkxwcag1gh4scqa4ksn2
Hostname: node1
Joined at: 2018-07-03 04:26:21.704201719 +0000 utc
Status:
State: Ready
Availability: Active
Address: 192.168.99.100
Manager Status:
Address: 192.168.99.100:2377
Raft Status: Reachable
Leader: No
Platform:
Operating System: linux
Architecture: x86_64
Resources:
CPUs: 1
Memory: 995.6MiB
Plugins:
Log: awslogs, fluentd, gcplogs, gelf, journald, json-file, logentries, splunk, syslog
Network: bridge, host, macvlan, null, overlay
Volume: local
Engine Version: 18.05.0-ce
Engine Labels:
- provider=virtualbox
TLS Info:
TrustRoot:
-----BEGIN CERTIFICATE-----
MIIBajCCARCgAwIBAgIUamHxAfppMX348BrueTr+dCz5ARYwCgYIKoZIzj0EAwIw
EzERMA8GA1UEAxMIc3dhcm0tY2EwHhcNMTgwNzA0MDk0NTAwWhcNMzgwNjI5MDk0
NTAwWjATMREwDwYDVQQDEwhzd2FybS1jYTBZMBMGByqGSM49AgEGCCqGSM49AwEH
A0IABMC8bm5gttTZKIhnLh80R5Y4CaPk6TP36+RivAi1EtT1qWK9v5kYGABFeyc1
CgwQwf7xqzyj148HX5uWzrSvPbOjQjBAMA4GA1UdDwEB/wQEAwIBBjAPBgNVHRMB
Af8EBTADAQH/MB0GA1UdDgQWBBSAc42YrBt8loVFLwrnNYe/DklXMDAKBggqhkjO
PQQDAgNIADBFAiEAmAU0bwgtkGqMH2HVcPT46q82vbx0AzM0boNpliLs9w4CIA0T
5N7eOpcmxC4seiyRo/Z9m/K+1+eeErEpOrh5LOsf
-----END CERTIFICATE-----
Issuer Subject: MBMxETAPBgNVBAMTCHN3YXJtLWNh
Issuer Public Key: MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEwLxubmC21NkoiGcuHzRHljgJo+TpM/fr5GK8CLUS1PWpYr2/mRgYAEV7JzUKDBDB/vGrPKPXjwdfm5bOtK89sw==
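When only one or two values from the inspect output are needed, a Go template passed via --format is usually easier than parsing the full text. This is a small sketch that assumes the .Spec.Role, .Spec.Availability, .Status.State and .ManagerStatus.Leader fields of the node object. The helper names node_summary and is_leader are invented for illustration.

```shell
# node_summary <NODE>: print role, availability and state on one line.
node_summary() {
  docker node inspect \
    --format '{{.Spec.Role}} {{.Spec.Availability}} {{.Status.State}}' "$1"
}

# is_leader <NODE>: succeed only if the node is the current Raft leader.
# The `if` guard keeps the template from failing on worker nodes,
# which have no ManagerStatus section.
is_leader() {
  [ "$(docker node inspect \
      --format '{{if .ManagerStatus}}{{.ManagerStatus.Leader}}{{end}}' "$1")" = "true" ]
}

# Example calls, using node names from the listing above:
#   node_summary node1
#   is_leader node4 && echo "node4 is the leader"
```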
Updating a node
A node can be updated in the following ways:
- Change node availability
- Add or remove label metadata
- Change a node role
Changing node availability
- Drain a manager node so that it only performs swarm management tasks and is unavailable for task assignment.
- Drain a node so you can take it down for maintenance.
- Pause a node so it can't receive new tasks.
- Restore an unavailable or paused node to available status.
For example, set the availability of node-1 to drain:
docker node update --availability drain node-1
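For maintenance it is often useful to wait until the drained node is actually empty, and to put it back into rotation afterwards. The following is one possible sketch, not the only way to do it. The helper names and the POLL_SECONDS variable are hypothetical, and it treats "no task with desired state running is listed for the node" as the signal that draining has finished.

```shell
# drain_node <NODE>
# Stop scheduling on the node, then poll until no task on it
# still has the desired state "running".
drain_node() {
  local node="$1"
  docker node update --availability drain "$node" >/dev/null || return 1
  while [ -n "$(docker node ps "$node" --filter desired-state=running --quiet)" ]; do
    sleep "${POLL_SECONDS:-2}"
  done
  echo "$node drained"
}

# activate_node <NODE>: make the node available for task assignment again.
activate_node() {
  docker node update --availability active "$1"
}

# Example calls (run on a manager node):
#   drain_node node-1
#   ...perform maintenance...
#   activate_node node-1
```

Setting a node back to active does not move tasks back onto it automatically. It only makes the node eligible for tasks scheduled from then on.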
Promoting or demoting a node
Promote node-3 and node-2 to manager
docker node promote node-3 node-2
Node node-3 promoted to a manager in the swarm.
Node node-2 promoted to a manager in the swarm.
Demote node-3 and node-2 to worker
$ docker node demote node-3 node-2
Manager node-3 demoted in the swarm.
Manager node-2 demoted in the swarm.
Alternatively, use docker node update --role manager <NODE-ID> and docker node update --role worker <NODE-ID>
Leave the swarm
To remove a node from the swarm, run the following command on that node
$ docker swarm leave
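When a node leaves the swarm, the Docker Engine stops running in swarm mode and the orchestrator no longer schedules tasks to the node. If the node is a manager node, you receive a warning about maintaining the quorum. To override the warning, pass --force.
After the node has left the swarm, run the following command on a manager node to remove it from the node list
$ docker node rm <NODE-ID>
Reposted from: https://blog.51cto.com/zengestudy/2136515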