Orchestrator Installation and Deployment
I. Orchestrator Overview
1. What is Orchestrator
Orchestrator is an open-source high-availability and topology-visualization management tool for MySQL replication, written in Go. It actively discovers the current topology and master-slave replication state, and supports adjusting MySQL replication topologies, automatic failover when the master fails, and manual master switchover (switchover), among other features.
Orchestrator stores its metadata in a MySQL or SQLite backend. It provides a web interface that shows the topology and instance status of MySQL clusters and allows some instance settings to be changed through the browser; it also exposes a command-line client and an HTTP API for more flexible, automated operations. Master failover in Orchestrator can be either automatic or manual, and manual switchover comes in several forms: recover, force-master-failover, force-master-takeover and graceful-master-takeover.
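As an illustration of the command-line interface, a planned (graceful) master takeover can be sketched as below. This is only a hypothetical example for the lab environment described later (current master 192.168.220.110, designated new master 192.168.220.100); the exact orchestrator-client flags may differ between versions, so check them before use.
# Hypothetical sketch: gracefully promote 192.168.220.100, demoting the current master 192.168.220.110
[root@orche ~]# orchestrator-client -c graceful-master-takeover -i 192.168.220.110:3306 -d 192.168.220.100:3306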
Compared with MHA, Orchestrator puts more weight on managing the replication topology: it can re-arrange almost any MySQL replication topology, and builds high availability on top of that. In addition, Orchestrator itself can be deployed on multiple nodes and uses the Raft consensus protocol to keep itself highly available.
2. Features and Functions of Orchestrator
1. Detection and auditing of replication clusters: it actively discovers the current topology and master-slave replication state and reads basic MySQL information such as replication status and configuration.
2. Safe topology refactoring: replicas can be relocated to replicate from another server. Orchestrator understands the replication rules and can work with binlog file/position, GTID, Pseudo-GTID and binlog servers; it provides a clean visualization of the topology, makes replication problems visible, and lets you change the topology with simple drag-and-drop.
3. Failure recovery: based on the topology information it can identify various kinds of failure and, depending on the configuration, perform automatic recovery.
3. Architecture Diagram
II. Orchestrator Test Environment Plan
| IP address | Hostname | OS | MySQL version | MySQL port |
| --- | --- | --- | --- | --- |
| 192.168.220.107 | orche-1 | CentOS 7.9 | 5.7.40 | 3306 |
| 192.168.220.10 | orche-2 | CentOS 7.9 | 5.7.40 | 3306 |
| 192.168.220.11 | orche-3 | CentOS 7.9 | 5.7.40 | 3306 |
| 192.168.220.110 | master | CentOS 7.9 | 5.7.40 | 3306 |
| 192.168.220.100 | slave-1 | CentOS 7.9 | 5.7.40 | 3306 |
| 192.168.220.200 | slave-2 | CentOS 7.9 | 5.7.40 | 3306 |
III. MySQL Deployment
1. Install the MySQL software on all six servers (three for the master-slave replication setup, three for the Orchestrator high-availability cluster)
2. Set up MySQL master-slave replication on three of the servers (one master, two slaves; the steps are omitted here, they were covered in an earlier post on this blog)
IV. Orchestrator Cluster Deployment (perform the same steps on all three nodes)
1. Install the jq tool
# Install the EPEL repository
[root@orche ~]# yum install epel-release -y
# Install jq
[root@orche ~]# yum install jq -y
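jq matters here because orchestrator-client parses the JSON returned by the orchestrator HTTP API with it. A quick sanity check after installation (the version printed is simply whatever EPEL currently ships):
# Confirm jq is installed and can parse JSON
[root@orche ~]# jq --version
[root@orche ~]# echo '{"Code":"OK","Message":"healthy"}' | jq -r .Code
OK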
2. Install Orchestrator
Download from: https://github.com/openark/orchestrator/releases
[root@orche ~]# cd /orchestrator/
[root@orche orchestrator]# ls
orchestrator-3.2.6-1.x86_64.rpm orchestrator-client-3.2.6-1.x86_64.rpm
[root@orche orchestrator]# rpm -ivh orchestrator-3.2.6-1.x86_64.rpm
Preparing...                          ################################# [100%]
Updating / installing...
1:orchestrator-1:3.2.6-1 ################################# [100%]
[root@orche orchestrator]# rpm -ivh orchestrator-client-3.2.6-1.x86_64.rpm
Preparing...                          ################################# [100%]
Updating / installing...
1:orchestrator-client-1:3.2.6-1 ################################# [100%]
# Installation finished; verify the packages
[root@orche orchestrator]# rpm -qa | grep orchestrator
orchestrator-client-3.2.6-1.x86_64
orchestrator-3.2.6-1.x86_64
3. Create the metadata database and user in MySQL on the orchestrator nodes
Create them on the orchestrator server (192.168.220.107); do the same on the other two orchestrator servers.
# Log in to MySQL
[root@orche ~]# mysql -uroot -p
# Create the metadata database and user
root@(none) 21:01 mysql>create database orchestrator;
root@(none) 21:01 mysql>create user 'orchestrator'@'%' identified by 'mysql';
root@(none) 21:01 mysql>grant all on orchestrator.* to 'orchestrator'@'%';
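Before moving on, it may be worth confirming that the new backend account can actually log in and see its schema; a minimal check, run locally on each orchestrator node:
# The account was created with host '%', so a TCP login via 127.0.0.1 should work
[root@orche ~]# mysql -uorchestrator -pmysql -h127.0.0.1 -P3306 -e "show databases like 'orchestrator';"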
4. Create the topology user on the MySQL master
Create it on the master (192.168.220.110); the statement replicates to the two slaves.
root@(none) 21:02 mysql>GRANT SELECT, RELOAD, PROCESS, SUPER, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'orchestrator'@'%' identified by 'mysql';
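It is also worth verifying, from one of the orchestrator nodes, that this topology user can reach the master and read replication information (the same check against the two slaves does no harm):
# Run from an orchestrator node; "show slave hosts" matches the DiscoverByShowSlaveHosts setting used later
[root@orche ~]# mysql -uorchestrator -pmysql -h192.168.220.110 -P3306 -e "show slave hosts;"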
5. Edit the configuration file
On 192.168.220.107, copy the sample template and modify it; do the same on the other two orchestrator servers. Note that the "#" annotations in the listing below are explanatory only and must be removed from the real file, since JSON does not allow comments.
[root@orche ~]# cd /usr/local/orchestrator/
[root@orche orchestrator]# cp orchestrator-sample.conf.json orchestrator.conf.json
[root@orche orchestrator]# cat orchestrator.conf.json
{
"Debug": true,
"EnableSyslog": false,
"ListenAddress": ":3000",
"MySQLTopologyUser": "orchestrator",
"MySQLTopologyPassword": "mysql",
"MySQLTopologyCredentialsConfigFile": "",
"MySQLTopologySSLPrivateKeyFile": "",
"MySQLTopologySSLCertFile": "",
"MySQLTopologySSLCAFile": "",
"MySQLTopologySSLSkipVerify": true,
"MySQLTopologyUseMutualTLS": false,
"MySQLOrchestratorHost": "192.168.220.107", # orchestrator服务器的主机号是什么就填什么
"MySQLOrchestratorPort": 3306,
"MySQLOrchestratorDatabase": "orchestrator",
"MySQLOrchestratorUser": "orchestrator",
"MySQLOrchestratorPassword": "mysql",
"MySQLOrchestratorCredentialsConfigFile": "",
"MySQLOrchestratorSSLPrivateKeyFile": "",
"MySQLOrchestratorSSLCertFile": "",
"MySQLOrchestratorSSLCAFile": "",
"MySQLOrchestratorSSLSkipVerify": true,
"MySQLOrchestratorUseMutualTLS": false,
"MySQLConnectTimeoutSeconds": 1,
# The sample template does not include the raft settings below; add them yourself
"RaftEnabled": true,
"RaftDataDir": "/data/orchestrator",
"RaftBind": "192.168.220.107", # orchestrator服务器的主机号是什么就填什么
"DefaultRaftPort": 10008,
"RaftNodes": [
"192.168.220.10",
"192.168.220.11",
"192.168.220.107"
],
"DefaultInstancePort": 3306,
"DiscoverByShowSlaveHosts": true,
"InstancePollSeconds": 5,
"DiscoveryIgnoreReplicaHostnameFilters": [
"a_host_i_want_to_ignore[.]example[.]com",
".*[.]ignore_all_hosts_from_this_domain[.]example[.]com",
"a_host_with_extra_port_i_want_to_ignore[.]example[.]com:3307"
],
"UnseenInstanceForgetHours": 240,
"SnapshotTopologiesIntervalHours": 0,
"InstanceBulkOperationsWaitTimeoutSeconds": 10,
"HostnameResolveMethod": "default",
"MySQLHostnameResolveMethod": "@@hostname",
"SkipBinlogServerUnresolveCheck": true,
"ExpiryHostnameResolvesMinutes": 60,
"RejectHostnameResolvePattern": "",
"ReasonableReplicationLagSeconds": 10,
"ProblemIgnoreHostnameFilters": [],
"VerifyReplicationFilters": false,
"ReasonableMaintenanceReplicationLagSeconds": 20,
"CandidateInstanceExpireMinutes": 60,
"AuditLogFile": "",
"AuditToSyslog": false,
"RemoveTextFromHostnameDisplay": ".mydomain.com:3306",
"ReadOnly": false,
"AuthenticationMethod": "",
"HTTPAuthUser": "",
"HTTPAuthPassword": "",
"AuthUserHeader": "",
"PowerAuthUsers": [
"*"
],
"ClusterNameToAlias": {
"127.0.0.1": "test suite"
},
"ReplicationLagQuery": "",
"DetectClusterAliasQuery": "SELECT SUBSTRING_INDEX(@@hostname, '.', 1)",
"DetectClusterDomainQuery": "",
"DetectInstanceAliasQuery": "",
"DetectPromotionRuleQuery": "",
"DataCenterPattern": "[.]([^.]+)[.][^.]+[.]mydomain[.]com",
"PhysicalEnvironmentPattern": "[.]([^.]+[.][^.]+)[.]mydomain[.]com",
"PromotionIgnoreHostnameFilters": [],
"DetectSemiSyncEnforcedQuery": "",
"ServeAgentsHttp": false,
"AgentsServerPort": ":3001",
"AgentsUseSSL": false,
"AgentsUseMutualTLS": false,
"AgentSSLSkipVerify": false,
"AgentSSLPrivateKeyFile": "",
"AgentSSLCertFile": "",
"AgentSSLCAFile": "",
"AgentSSLValidOUs": [],
"UseSSL": false,
"UseMutualTLS": false,
"SSLSkipVerify": false,
"SSLPrivateKeyFile": "",
"SSLCertFile": "",
"SSLCAFile": "",
"SSLValidOUs": [],
"URLPrefix": "",
"StatusEndpoint": "/api/status",
"StatusSimpleHealth": true,
"StatusOUVerify": false,
"AgentPollMinutes": 60,
"UnseenAgentForgetHours": 6,
"StaleSeedFailMinutes": 60,
"SeedAcceptableBytesDiff": 8192,
"PseudoGTIDPattern": "",
"PseudoGTIDPatternIsFixedSubstring": false,
"PseudoGTIDMonotonicHint": "asc:",
"DetectPseudoGTIDQuery": "",
"BinlogEventsChunkSize": 10000,
"SkipBinlogEventsContaining": [],
"ReduceReplicationAnalysisCount": true,
"FailureDetectionPeriodBlockMinutes": 60,
"FailMasterPromotionOnLagMinutes": 0,
"RecoveryPeriodBlockSeconds": 3600,
"RecoveryIgnoreHostnameFilters": [],
# The next two filters must be set to "*", otherwise automatic failover will not take place
"RecoverMasterClusterFilters": [
"*"
],
"RecoverIntermediateMasterClusterFilters": [
"*"
],
"OnFailureDetectionProcesses": [
"echo 'Detected {failureType} on {failureCluster}. Affected replicas: {countSlaves}' >> /tmp/recovery.log"
],
"PreGracefulTakeoverProcesses": [
"echo 'Planned takeover about to take place on {failureCluster}. Master will switch to read_only' >> /tmp/recovery.log"
],
"PreFailoverProcesses": [
"echo 'Will recover from {failureType} on {failureCluster}' >> /tmp/recovery.log"
],
"PostFailoverProcesses": [
"echo '(for all types) Recovered from {failureType} on {failureCluster}. Failed: {failedHost}:{failedPort}; Successor: {successorHost}:{successorPort}' >> /tmp/recovery.log"
],
"PostUnsuccessfulFailoverProcesses": [],
"PostMasterFailoverProcesses": [
"echo 'Recovered from {failureType} on {failureCluster}. Failed: {failedHost}:{failedPort}; Promoted: {successorHost}:{successorPort}' >> /tmp/recovery.log"
],
"PostIntermediateMasterFailoverProcesses": [
"echo 'Recovered from {failureType} on {failureCluster}. Failed: {failedHost}:{failedPort}; Successor: {successorHost}:{successorPort}' >> /tmp/recovery.log"
],
"PostGracefulTakeoverProcesses": [
"echo 'Planned takeover complete' >> /tmp/recovery.log"
],
"CoMasterRecoveryMustPromoteOtherCoMaster": true,
"DetachLostSlavesAfterMasterFailover": true,
"ApplyMySQLPromotionAfterMasterFailover": true,
"PreventCrossDataCenterMasterFailover": false,
"PreventCrossRegionMasterFailover": false,
"MasterFailoverDetachReplicaMasterHost": false,
"MasterFailoverLostInstancesDowntimeMinutes": 0,
"PostponeReplicaRecoveryOnLagMinutes": 0,
"OSCIgnoreHostnameFilters": [],
"GraphiteAddr": "",
"GraphitePath": "",
"GraphiteConvertHostnameDotsToUnderscores": true,
"ConsulAddress": "",
"ConsulAclToken": "",
"ConsulKVStoreProvider": "consul"
}
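Because JSON does not allow comments, it helps to validate the finished file with jq before starting the service; if any of the "#" annotations above were left in the real file, this check fails with a parse error:
# Validate the configuration file; prints "config OK" only when the JSON parses cleanly
[root@orche orchestrator]# jq . /usr/local/orchestrator/orchestrator.conf.json > /dev/null && echo "config OK"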
6. Start Orchestrator (run it on all three orchestrator nodes)
# With a three-node raft cluster, losing one node does not affect operation and failover still works; with two nodes down, orchestrator becomes unavailable and cannot fail over
[root@orche ~]# cd /usr/local/orchestrator/
# Run orchestrator in the background
[root@orche orchestrator]# nohup ./orchestrator http &
# Check the process
[root@orche orchestrator]# ps aux | grep orchestrator
root 116243 0.5 1.5 916220 29452 ? Sl 10:12 3:55 ./orchestrator http
root 126975 0.0 0.0 112808 968 pts/0 S+ 21:17 0:00 grep --color=auto orchestrato
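Once all three nodes are running, health and raft leadership can also be checked over the HTTP API. This is a sketch: /api/status is the endpoint configured via "StatusEndpoint" above, and the upstream documentation describes /api/leader-check as returning HTTP 200 only on the current raft leader, so confirm the behaviour against your version.
# Basic health check (endpoint configured via "StatusEndpoint")
[root@orche orchestrator]# curl -s http://192.168.220.107:3000/api/status | jq .
# Leader check: only the current raft leader is expected to return HTTP 200
[root@orche orchestrator]# curl -s -o /dev/null -w "%{http_code}\n" http://192.168.220.107:3000/api/leader-check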
Access the Orchestrator web UI
Open http://192.168.220.107:3000/, or http://192.168.220.10:3000/ or http://192.168.220.11:3000/ (any of the three nodes works).
The MySQL topology has to be discovered (imported) into orchestrator first, as shown below.
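Besides the "Discover" form in the web UI, the import can be done with orchestrator-client; a sketch, assuming the three API endpoints below and that discovering the master is enough for the replicas to be found automatically:
# Point orchestrator-client at the API; with several endpoints listed it locates the raft leader itself
[root@orche ~]# export ORCHESTRATOR_API="http://192.168.220.107:3000/api http://192.168.220.10:3000/api http://192.168.220.11:3000/api"
# Discover the master; its replicas are then picked up automatically by polling
[root@orche ~]# orchestrator-client -c discover -i 192.168.220.110:3306
# List the clusters orchestrator now knows about
[root@orche ~]# orchestrator-client -c clusters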
Once discovery succeeds, the cluster and its replication topology are displayed in the web UI.