Introduction:

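An MHA architecture needs at least three servers: two masters (one primary master and one standby master) and a third machine acting as a slave.

MHA consists of two parts: MHA Manager (the management node) and MHA Node (the data node).

During automatic failover, MHA tries to save the binary logs from the crashed master so that as little data as possible is lost. Data can still be lost when the binary logs cannot be retrieved, for example when the master has a hardware failure or cannot be reached over SSH.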


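The MHA failover process

(1) Save the binary log events (binlog events) from the crashed master;
(2) Identify the slave that holds the latest updates;
(3) Apply the differential relay logs (relay log) to the other slaves;
(4) Apply the binary log events (binlog events) saved from the master;
(5) Promote one slave to be the new master;
(6) Point the other slaves at the new master and resume replication;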
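
Step (2) can be illustrated with a small sketch. It assumes you have already collected one line per slave in the form "host Master_Log_File Read_Master_Log_Pos" (values taken from show slave status). The function name pick_latest_slave is hypothetical, and this is only the comparison the step conceptually performs, not MHA's own implementation.

```shell
# Hypothetical helper: reads "host binlog_file read_pos" lines on stdin
# and prints the host that has received the furthest binlog position.
pick_latest_slave() {
    # Sort by binlog file name, then numerically by position; the last
    # line after sorting belongs to the most advanced slave.
    LC_ALL=C sort -k2,2 -k3,3n | tail -n 1 | awk '{print $1}'
}
```

For example, feeding it the three lines "master2 mysql-bin.000001 545", "slave1 mysql-bin.000002 120" and "slave2 mysql-bin.000001 900" selects slave1, because a newer binlog file outranks a larger position in an older file.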


I: Plan the server layout:

I used five servers in a virtual environment.

192.168.0.221    manager
192.168.0.222    master1
192.168.0.223    master2
192.168.0.224    slave1
192.168.0.225    slave2



0. Script to run before installation

#!/bin/bash
service  iptables  stop
chkconfig  iptables  off
iptables  -F
sed -i  's/SELINUX=enforcing/SELINUX=disabled/g'  /etc/selinux/config
yum  install  wget  gcc  gcc-c++   -y

1. Configure the hosts file

cat   >> /etc/hosts  << EOF
192.168.0.221    manager
192.168.0.222    master1
192.168.0.223    master2
192.168.0.224    slave1
192.168.0.225    slave2
EOF
[root@manager ~]# scp -o StrictHostKeyChecking=no /etc/hosts root@master1:/etc/
[root@manager ~]# scp -o StrictHostKeyChecking=no /etc/hosts root@master2:/etc/
[root@manager ~]# scp -o StrictHostKeyChecking=no /etc/hosts root@slave1:/etc/
[root@manager ~]# scp -o StrictHostKeyChecking=no /etc/hosts root@slave2:/etc/
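
Appending with cat >> adds the same five lines again every time the step is repeated. A minimal sketch of an idempotent alternative follows; the helper name add_host_entry is hypothetical, and it takes the target file as an argument so it can be tried on a scratch file first.

```shell
# Hypothetical helper: append "ip  name" to a hosts file only when that
# exact pair is not already present.
add_host_entry() {
    hosts_file=$1
    ip=$2
    name=$3
    # awk exits 0 when a line with this ip and name already exists.
    if ! awk -v ip="$ip" -v name="$name" \
            '$1 == ip && $2 == name { found = 1 } END { exit !found }' \
            "$hosts_file" 2>/dev/null; then
        printf '%s    %s\n' "$ip" "$name" >> "$hosts_file"
    fi
}
```

Usage would look like add_host_entry /etc/hosts 192.168.0.221 manager, repeated for each of the five hosts.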


2. Passwordless SSH login

Host: manager

ssh-keygen -t rsa
ssh-copy-id -i ~/.ssh/id_rsa.pub root@master1
ssh-copy-id -i ~/.ssh/id_rsa.pub root@master2
ssh-copy-id -i ~/.ssh/id_rsa.pub root@slave1
ssh-copy-id -i ~/.ssh/id_rsa.pub root@slave2


Host: master1

ssh-keygen -t rsa
ssh-copy-id -i ~/.ssh/id_rsa.pub root@manager
ssh-copy-id -i ~/.ssh/id_rsa.pub root@master2
ssh-copy-id -i ~/.ssh/id_rsa.pub root@slave1
ssh-copy-id -i ~/.ssh/id_rsa.pub root@slave2


Host: master2

ssh-keygen -t rsa
ssh-copy-id -i ~/.ssh/id_rsa.pub root@manager
ssh-copy-id -i ~/.ssh/id_rsa.pub root@master1
ssh-copy-id -i ~/.ssh/id_rsa.pub root@slave1
ssh-copy-id -i ~/.ssh/id_rsa.pub root@slave2


Host: slave1

ssh-keygen -t rsa
ssh-copy-id -i ~/.ssh/id_rsa.pub root@manager
ssh-copy-id -i ~/.ssh/id_rsa.pub root@master1
ssh-copy-id -i ~/.ssh/id_rsa.pub root@master2
ssh-copy-id -i ~/.ssh/id_rsa.pub root@slave2


Host: slave2

ssh-keygen -t rsa
ssh-copy-id -i ~/.ssh/id_rsa.pub root@manager
ssh-copy-id -i ~/.ssh/id_rsa.pub root@master1
ssh-copy-id -i ~/.ssh/id_rsa.pub root@master2
ssh-copy-id -i ~/.ssh/id_rsa.pub root@slave1
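
The five blocks above differ only in which host is left out, so the same thing can be written as a loop. This is a sketch under the assumption that the host names from /etc/hosts are used as-is; the names MHA_HOSTS, peer_hosts and copy_keys_to_peers are hypothetical.

```shell
# Host names as defined in /etc/hosts in this article.
MHA_HOSTS="manager master1 master2 slave1 slave2"

# Print every host except the one given as $1, one per line.
peer_hosts() {
    self=$1
    for h in $MHA_HOSTS; do
        [ "$h" = "$self" ] || printf '%s\n' "$h"
    done
}

# Copy this host's public key to every peer (prompts for each password).
# Run ssh-keygen -t rsa first, as in the manual steps.
copy_keys_to_peers() {
    for h in $(peer_hosts "$1"); do
        ssh-copy-id -i ~/.ssh/id_rsa.pub "root@$h"
    done
}
```

On master1, for instance, you would run copy_keys_to_peers master1.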


II: Set up the MySQL databases


1. Install MySQL on master1, master2, slave1, and slave2

http://yujianglei.blog.51cto.com/7215578/1725585

Change the root password:
mysqladmin -uroot password '123'


2. Configure master-slave replication between master1, master2, slave1, and slave2


Host: master1

[root@master1 ~]# egrep "log-bin|server-id" /etc/my.cnf
log-bin=/mydata/bin_log/mysql-bin
server-id=221


Host: master2

[root@master2 ~]# egrep "log-bin|server-id" /etc/my.cnf
log-bin=/mydata/bin_log/mysql-bin
server-id=222


Host: slave1

[root@slave1 ~]# egrep "log-bin|server-id" /etc/my.cnf
log-bin=/mydata/bin_log/mysql-bin
server-id=224


Host: slave2

[root@slave2 ~]# egrep "log-bin|server-id" /etc/my.cnf
log-bin=/mydata/bin_log/mysql-bin
server-id=225
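
Every server in a replication topology needs its own server-id, so it is worth checking the four values for clashes. The sketch below assumes you have gathered "host server-id" pairs by hand or over ssh; duplicate_server_ids is a hypothetical helper name.

```shell
# Hypothetical helper: reads "host server_id" lines on stdin and prints
# any server-id that appears more than once (no output means all unique).
duplicate_server_ids() {
    awk '{print $2}' | sort -n | uniq -d
}
```

With the values used here (221, 222, 224, 225) it prints nothing.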


3. Create the replication account on master1 and master2. master2 is the standby master, so it needs the account as well.

[root@master1 ~]# mysql -uroot -p123 -e "grant all privileges on *.* to 'rep'@'192.168.0.%' identified by 'rep123';flush privileges"
[root@master2 ~]# mysql -uroot -p123 -e "grant all privileges on *.* to 'rep'@'192.168.0.%' identified by 'rep123';flush privileges"
mysql -uroot -p123 -e "select User,Host,Password  from mysql.user"

4. On master1, check the master status

[root@master1 ~]# mysql -uroot -p123 -e "show  master  status"
Warning: Using a password on the command line interface can be insecure.
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000001 |      545 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
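
The File and Position values in this table are what the slaves need for CHANGE MASTER TO. As a sketch, they can be pulled out of the table and turned into the statement instead of being copied by hand. The helper names parse_master_status and change_master_sql are hypothetical; the account rep/rep123 and port 3306 are the ones used in this article.

```shell
# Reads the table printed by: mysql -e "show master status"
# and prints "binlog_file position".
parse_master_status() {
    awk -F'|' 'NF > 3 && $2 !~ /File/ {
        gsub(/[ \t]/, "", $2)
        gsub(/[ \t]/, "", $3)
        print $2, $3
    }'
}

# Prints a CHANGE MASTER TO statement for master host $1,
# binlog file $2 and position $3.
change_master_sql() {
    printf "CHANGE MASTER TO MASTER_HOST='%s', MASTER_USER='rep', MASTER_PASSWORD='rep123', MASTER_PORT=3306, MASTER_LOG_FILE='%s', MASTER_LOG_POS=%s;\n" "$1" "$2" "$3"
}
```

For the output above, parse_master_status yields "mysql-bin.000001 545", which can be passed on as change_master_sql 192.168.0.222 mysql-bin.000001 545.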

5. Set up replication on master2, slave1, and slave2

Run on master2:

mysql> CHANGE MASTER TO
    -> MASTER_HOST='192.168.0.222',
    -> MASTER_USER='rep',
    -> MASTER_PASSWORD='rep123',
    -> MASTER_PORT=3306,
    -> MASTER_LOG_FILE='mysql-bin.000001',
    -> MASTER_LOG_POS=545;
Query OK, 0 rows affected, 2 warnings (0.01 sec)
mysql> start  slave;
Query OK, 0 rows affected (0.01 sec)
mysql> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.0.222
                  Master_User: rep
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 545
               Relay_Log_File: master2-relay-bin.000002
                Relay_Log_Pos: 283
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 545
              Relay_Log_Space: 458
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 221
                  Master_UUID: 5dea20fa-942d-11e5-93c3-001c42ccc5ea
             Master_Info_File: /mydata/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 0
1 row in set (0.00 sec)
ERROR:
No query specified


Run on slave1:

mysql> CHANGE MASTER TO
    -> MASTER_HOST='192.168.0.222',
    -> MASTER_USER='rep',
    -> MASTER_PASSWORD='rep123',
    -> MASTER_PORT=3306,
    -> MASTER_LOG_FILE='mysql-bin.000001',
    -> MASTER_LOG_POS=545;
Query OK, 0 rows affected, 2 warnings (0.02 sec)
mysql> start  slave;
Query OK, 0 rows affected (0.01 sec)
mysql> show  slave  status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.0.222
                  Master_User: rep
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 545
               Relay_Log_File: slave1-relay-bin.000002
                Relay_Log_Pos: 283
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 545
              Relay_Log_Space: 457
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 221
                  Master_UUID: 5dea20fa-942d-11e5-93c3-001c42ccc5ea
             Master_Info_File: /mydata/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 0
1 row in set (0.00 sec)
ERROR:
No query specified


Run on slave2:

mysql> CHANGE MASTER TO
    -> MASTER_HOST='192.168.0.222',
    -> MASTER_USER='rep',
    -> MASTER_PASSWORD='rep123',
    -> MASTER_PORT=3306,
    -> MASTER_LOG_FILE='mysql-bin.000001',
    -> MASTER_LOG_POS=545;
Query OK, 0 rows affected, 2 warnings (0.02 sec)
mysql> start  slave;
Query OK, 0 rows affected (0.01 sec)
mysql> show  slave  status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.0.222
                  Master_User: rep
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 545
               Relay_Log_File: slave2-relay-bin.000002
                Relay_Log_Pos: 283
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 545
              Relay_Log_Space: 457
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 221
                  Master_UUID: 5dea20fa-942d-11e5-93c3-001c42ccc5ea
             Master_Info_File: /mydata/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 0
1 row in set (0.00 sec)
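
In all three outputs the two lines that matter most are Slave_IO_Running and Slave_SQL_Running, which must both be Yes. A minimal sketch for checking that automatically follows; slave_is_healthy is a hypothetical helper that reads the \G output on stdin.

```shell
# Hypothetical helper: reads "show slave status\G" output on stdin and
# returns 0 only when both replication threads report "Yes".
slave_is_healthy() {
    awk '
        $1 == "Slave_IO_Running:"  { io  = $2 }
        $1 == "Slave_SQL_Running:" { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }'
}
```

It could be used as mysql -uroot -p123 -e 'show slave status\G' | slave_is_healthy && echo OK on each slave.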


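III: Set up MHA

1. Create the account that MHA uses to manage the databases. Every database server needs all five grants (one per host IP); master1 is shown here as the example.

[root@master1 ~]# mysql -uroot -p123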
mysql> grant all privileges on *.* to 'mha_rep'@'192.168.0.221' identified by 'mha123';
mysql> grant all privileges on *.* to 'mha_rep'@'192.168.0.222' identified by 'mha123';
mysql> grant all privileges on *.* to 'mha_rep'@'192.168.0.223' identified by 'mha123';
mysql> grant all privileges on *.* to 'mha_rep'@'192.168.0.224' identified by 'mha123';
mysql> grant all privileges on *.* to 'mha_rep'@'192.168.0.225' identified by 'mha123';
mysql> flush privileges;
mysql> select user,host,password from mysql.user where user='mha_rep';
+---------+---------------+-------------------------------------------+
| user    | host          | password                                  |
+---------+---------------+-------------------------------------------+
| mha_rep | 192.168.0.221 | *7D80375C32408CA76BB965C11657023AABF22F0A |
| mha_rep | 192.168.0.222 | *7D80375C32408CA76BB965C11657023AABF22F0A |
| mha_rep | 192.168.0.223 | *7D80375C32408CA76BB965C11657023AABF22F0A |
| mha_rep | 192.168.0.224 | *7D80375C32408CA76BB965C11657023AABF22F0A |
+---------+---------------+-------------------------------------------+
4 rows in set (0.00 sec)
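
Since the five grants differ only in the host IP, and they have to be repeated on every database server, they can be generated in a loop. This is a sketch using the account name, password and IPs from this article; mha_grant_sql is a hypothetical helper name.

```shell
# Hypothetical helper: prints one GRANT per host IP used in this setup,
# followed by FLUSH PRIVILEGES.
mha_grant_sql() {
    for ip in 192.168.0.221 192.168.0.222 192.168.0.223 192.168.0.224 192.168.0.225; do
        printf "grant all privileges on *.* to 'mha_rep'@'%s' identified by 'mha123';\n" "$ip"
    done
    printf 'flush privileges;\n'
}
```

On each database server the output could then be piped into the client, for example mha_grant_sql | mysql -uroot -p123.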

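2. Install the mha4mysql-node package on each of the four database hosts (master1, master2, slave1, and slave2). master1 is shown here; the other hosts are done the same way. The node package is needed on both the manager and the data nodes.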


[root@master1 ~]# rpm -ivh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
[root@master1 ~]# yum clean all
[root@master1 ~]# yum makecache
[root@master1 ~]# rpm --import /etc/pki/rpm-gpg/*
[root@master1 ~]# yum install perl perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Config-IniFiles  ncftp perl-Params-Validate  perl-CPAN perl-Test-Mock-LWP.noarch perl-LWP-Authen-Negotiate.noarch perl-devel perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker
[root@master1 ~]# yum install perl-Time-HiRes -y
[root@master1 ~]# wget https://downloads.mariadb.com/files/MHA/mha4mysql-node-0.54-0.el6.noarch.rpm
[root@master1 ~]# rpm -ivh mha4mysql-node-0.54-0.el6.noarch.rpm

3. Install the mha4mysql-manager and mha4mysql-node packages on manager

[root@manager ~]# yum install perl perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Config-IniFiles  ncftp perl-Params-Validate  perl-CPAN perl-Test-Mock-LWP.noarch perl-LWP-Authen-Negotiate.noarch perl-devel perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker
[root@manager ~]# yum install perl-Time-HiRes -y
[root@manager ~]# yum install perl cpan perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Net-Telnet -y
[root@manager ~]# wget http://rpmfind.net/linux/dag/redhat/el6/en/x86_64/dag/RPMS/perl-Log-Dispatch-2.26-1.el6.rf.noarch.rpm
[root@manager ~]# wget ftp://rpmfind.net/linux/dag/redhat/el6/en/x86_64/dag/RPMS/perl-Parallel-ForkManager-0.7.5-2.2.el6.rf.noarch.rpm
[root@manager ~]# wget ftp://rpmfind.net/linux/dag/redhat/el6/en/x86_64/dag/RPMS/perl-Mail-Sender-0.8.16-1.el6.rf.noarch.rpm
[root@manager ~]# wget ftp://rpmfind.net/linux/dag/redhat/el6/en/x86_64/dag/RPMS/perl-Mail-Sendmail-0.79-1.2.el6.rf.noarch.rpm
[root@manager ~]# yum localinstall *.rpm -y

Install the manager and node packages (the node package is installed the same way as in step 2):

wget https://downloads.mariadb.com/files/MHA/mha4mysql-manager-0.55-0.el6.noarch.rpm
rpm -ivh  mha4mysql-manager-0.55-0.el6.noarch.rpm


4. List the tools installed by mha4mysql-manager

[root@manager tools]# rpm -ql mha4mysql-manager |grep bin
/usr/bin/masterha_check_repl
/usr/bin/masterha_check_ssh
/usr/bin/masterha_check_status
/usr/bin/masterha_conf_host
/usr/bin/masterha_manager
/usr/bin/masterha_master_monitor
/usr/bin/masterha_master_switch
/usr/bin/masterha_secondary_check
/usr/bin/masterha_stop


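5. On the manager host, download the mha4mysql-manager source package and unpack it (the sample scripts and configuration are copied from it)

[root@manager ~]# wget https://downloads.mariadb.com/files/MHA/mha4mysql-manager-0.56.tar.gz
[root@manager ~]# tar -zxf mha4mysql-manager-0.56.tar.gz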
[root@manager ~]# mkdir -p /usr/local/mha/scripts
[root@manager ~]# cp mha4mysql-manager-0.56/samples/scripts/* /usr/local/mha/scripts/
[root@manager ~]# cp mha4mysql-manager-0.56/samples/conf/app1.cnf /usr/local/mha/mha.cnf
[root@manager ~]# tree /usr/local/mha/

6. Edit the MHA configuration file on the manager as follows

[server default]
user=mha_rep
password=mha123
ssh_user=root
repl_user=rep
repl_password=rep123
ping_interval=1
manager_workdir=/usr/local/mha
manager_log=/usr/local/mha/manager.log
secondary_check_script= masterha_secondary_check  -s 192.168.0.223 -s 192.168.0.224 -s 192.168.0.225
master_ip_failover_script=/usr/local/mha/scripts/master_ip_failover
#report_script= /usr/local/mha/scripts/send_report
#master_ip_online_change_script= /usr/local/mha/scripts/master_ip_online_change
#shutdown_script= /usr/local/mha/scripts/power_manager
[server1]
hostname=master1
ssh_port=22
candidate_master=1
master_binlog_dir=/mydata/bin_log/
[server2]
hostname=master2
ssh_port=22
candidate_master=1
master_binlog_dir=/mydata/bin_log/
[server3]
hostname=slave1
ssh_port=22
no_master=1
master_binlog_dir=/mydata/bin_log/
[server4]
hostname=slave2
ssh_port=22
no_master=1
master_binlog_dir=/mydata/bin_log/
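
The hostname values in this file must resolve on the manager, which is why /etc/hosts was set up earlier. A minimal sketch for cross-checking the two files follows; cnf_hostnames and unresolved_hosts are hypothetical helper names, and both file paths are passed in as arguments.

```shell
# Print every hostname= value found in the MHA configuration file $1.
cnf_hostnames() {
    awk -F= '$1 ~ /^[ \t]*hostname[ \t]*$/ {
        gsub(/[ \t]/, "", $2)
        print $2
    }' "$1"
}

# Print the hostnames from config $1 that have no entry in hosts file $2.
unresolved_hosts() {
    for h in $(cnf_hostnames "$1"); do
        awk -v name="$h" '
            { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
            END { exit !found }' "$2" || printf '%s\n' "$h"
    done
}
```

Running unresolved_hosts /usr/local/mha/mha.cnf /etc/hosts should print nothing when all four servers are listed.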

Write the master_ip_failover script

mv  /usr/local/mha/scripts/master_ip_failover    /usr/local/mha/scripts/master_ip_failover.bak
vi  /usr/local/mha/scripts/master_ip_failover
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;
my (
$command, $ssh_user, $orig_master_host, $orig_master_ip,
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port
);
my $vip = '192.168.0.99'; # Virtual IP
my $gateway = '192.168.0.1';#Gateway IP
my $interface = 'eth0';
my $key = "1";
my $ssh_start_vip = "/sbin/ifconfig $interface:$key $vip;/sbin/arping -I $interface -c 3 -s $vip $gateway >/dev/null 2>&1";
my $ssh_stop_vip = "/sbin/ifconfig $interface:$key down";
GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
);
exit &main();
sub main {
print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
if ( $command eq "stop" || $command eq "stopssh" ) {
# $orig_master_host, $orig_master_ip, $orig_master_port are passed.
# If you manage master ip address at global catalog database,
# invalidate orig_master_ip here.
my $exit_code = 1;
eval {
print "Disabling the VIP on old master: $orig_master_host \n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
# all arguments are passed.
# If you manage master ip address at global catalog database,
# activate new_master_ip here.
# You can also grant write access (create user, set read_only=0, etc) here.
my $exit_code = 10;
eval {
print "Enabling the VIP - $vip on the new master - $new_master_host \n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
`ssh $ssh_user\@$orig_master_host \" $ssh_start_vip \"`;
exit 0;
}
else {
&usage();
exit 1;
}
}
# A simple system call that enable the VIP on the new master
sub start_vip() {
`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
# A simple system call that disable the VIP on the old_master
sub stop_vip() {
`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}
sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
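
A file newly created with vi normally has no execute bit, and MHA has to be able to run this script, so it is worth checking before going further. The sketch below only checks that the file exists, is executable and starts with a shebang line; script_is_runnable is a hypothetical helper name.

```shell
# Hypothetical helper: returns 0 when $1 is a regular file, has an
# execute bit, and begins with a "#!" interpreter line.
script_is_runnable() {
    [ -f "$1" ] && [ -x "$1" ] && head -n 1 "$1" | grep -q '^#!'
}
```

If script_is_runnable /usr/local/mha/scripts/master_ip_failover fails, running chmod +x on the file is the likely fix.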

7. Check that SSH works

[root@manager mha]# masterha_check_ssh --conf=/usr/local/mha/mha.cnf
Sat Nov 28 11:28:35 2015 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Nov 28 11:28:35 2015 - [info] Reading application default configurations from /usr/local/mha/mha.cnf..
Sat Nov 28 11:28:35 2015 - [info] Reading server configurations from /usr/local/mha/mha.cnf..
Sat Nov 28 11:28:35 2015 - [info] Starting SSH connection tests..
Sat Nov 28 11:28:38 2015 - [debug]
Sat Nov 28 11:28:35 2015 - [debug]  Connecting via SSH from root@master1(192.168.0.222:22) to root@master2(192.168.0.223:22)..
Sat Nov 28 11:28:36 2015 - [debug]   ok.
Sat Nov 28 11:28:36 2015 - [debug]  Connecting via SSH from root@master1(192.168.0.222:22) to root@slave1(192.168.0.224:22)..
Sat Nov 28 11:28:37 2015 - [debug]   ok.
Sat Nov 28 11:28:37 2015 - [debug]  Connecting via SSH from root@master1(192.168.0.222:22) to root@slave2(192.168.0.225:22)..
Sat Nov 28 11:28:38 2015 - [debug]   ok.
Sat Nov 28 11:28:39 2015 - [debug]
Sat Nov 28 11:28:36 2015 - [debug]  Connecting via SSH from root@master2(192.168.0.223:22) to root@master1(192.168.0.222:22)..
Sat Nov 28 11:28:36 2015 - [debug]   ok.
Sat Nov 28 11:28:36 2015 - [debug]  Connecting via SSH from root@master2(192.168.0.223:22) to root@slave1(192.168.0.224:22)..
Sat Nov 28 11:28:38 2015 - [debug]   ok.
Sat Nov 28 11:28:38 2015 - [debug]  Connecting via SSH from root@master2(192.168.0.223:22) to root@slave2(192.168.0.225:22)..
Sat Nov 28 11:28:39 2015 - [debug]   ok.
Sat Nov 28 11:28:40 2015 - [debug]
Sat Nov 28 11:28:36 2015 - [debug]  Connecting via SSH from root@slave1(192.168.0.224:22) to root@master1(192.168.0.222:22)..
Sat Nov 28 11:28:37 2015 - [debug]   ok.
Sat Nov 28 11:28:37 2015 - [debug]  Connecting via SSH from root@slave1(192.168.0.224:22) to root@master2(192.168.0.223:22)..
Sat Nov 28 11:28:39 2015 - [debug]   ok.
Sat Nov 28 11:28:39 2015 - [debug]  Connecting via SSH from root@slave1(192.168.0.224:22) to root@slave2(192.168.0.225:22)..
Sat Nov 28 11:28:40 2015 - [debug]   ok.
Sat Nov 28 11:28:40 2015 - [debug]
Sat Nov 28 11:28:37 2015 - [debug]  Connecting via SSH from root@slave2(192.168.0.225:22) to root@master1(192.168.0.222:22)..
Sat Nov 28 11:28:38 2015 - [debug]   ok.
Sat Nov 28 11:28:38 2015 - [debug]  Connecting via SSH from root@slave2(192.168.0.225:22) to root@master2(192.168.0.223:22)..
Sat Nov 28 11:28:39 2015 - [debug]   ok.
Sat Nov 28 11:28:39 2015 - [debug]  Connecting via SSH from root@slave2(192.168.0.225:22) to root@slave1(192.168.0.224:22)..
Sat Nov 28 11:28:40 2015 - [debug]   ok.
Sat Nov 28 11:28:40 2015 - [info] All SSH connection tests passed successfully.

If you get the result above, SSH trust between the hosts is working.


8. Check that master-slave replication works

[root@manager mha]# masterha_check_repl --conf=/usr/local/mha/mha.cnf
Sun Nov 29 00:34:03 2015 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sun Nov 29 00:34:03 2015 - [info] Reading application default configurations from /usr/local/mha/mha.cnf..
Sun Nov 29 00:34:03 2015 - [info] Reading server configurations from /usr/local/mha/mha.cnf..
Sun Nov 29 00:34:03 2015 - [info] MHA::MasterMonitor version 0.55.
Sun Nov 29 00:34:03 2015 - [info] Dead Servers:
Sun Nov 29 00:34:03 2015 - [info] Alive Servers:
Sun Nov 29 00:34:03 2015 - [info]   master1(192.168.0.222:3306)
Sun Nov 29 00:34:03 2015 - [info]   master2(192.168.0.223:3306)
Sun Nov 29 00:34:03 2015 - [info]   slave1(192.168.0.224:3306)
Sun Nov 29 00:34:03 2015 - [info]   slave2(192.168.0.225:3306)
Sun Nov 29 00:34:03 2015 - [info] Alive Slaves:
Sun Nov 29 00:34:03 2015 - [info]   master2(192.168.0.223:3306)  Version=5.6.27-log (oldest major version between slaves) log-bin:enabled
Sun Nov 29 00:34:03 2015 - [info]     Replicating from 192.168.0.222(192.168.0.222:3306)
Sun Nov 29 00:34:03 2015 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun Nov 29 00:34:03 2015 - [info]   slave1(192.168.0.224:3306)  Version=5.6.27-log (oldest major version between slaves) log-bin:enabled
Sun Nov 29 00:34:03 2015 - [info]     Replicating from 192.168.0.222(192.168.0.222:3306)
Sun Nov 29 00:34:03 2015 - [info]     Not candidate for the new Master (no_master is set)
Sun Nov 29 00:34:03 2015 - [info]   slave2(192.168.0.225:3306)  Version=5.6.27-log (oldest major version between slaves) log-bin:enabled
Sun Nov 29 00:34:03 2015 - [info]     Replicating from 192.168.0.222(192.168.0.222:3306)
Sun Nov 29 00:34:03 2015 - [info]     Not candidate for the new Master (no_master is set)
Sun Nov 29 00:34:03 2015 - [info] Current Alive Master: master1(192.168.0.222:3306)
Sun Nov 29 00:34:03 2015 - [info] Checking slave configurations..
Sun Nov 29 00:34:03 2015 - [info]  read_only=1 is not set on slave master2(192.168.0.223:3306).
Sun Nov 29 00:34:03 2015 - [warning]  relay_log_purge=0 is not set on slave master2(192.168.0.223:3306).
Sun Nov 29 00:34:03 2015 - [info]  read_only=1 is not set on slave slave1(192.168.0.224:3306).
Sun Nov 29 00:34:03 2015 - [warning]  relay_log_purge=0 is not set on slave slave1(192.168.0.224:3306).
Sun Nov 29 00:34:03 2015 - [info]  read_only=1 is not set on slave slave2(192.168.0.225:3306).
Sun Nov 29 00:34:03 2015 - [warning]  relay_log_purge=0 is not set on slave slave2(192.168.0.225:3306).
Sun Nov 29 00:34:03 2015 - [info] Checking replication filtering settings..
Sun Nov 29 00:34:03 2015 - [info]  binlog_do_db= , binlog_ignore_db=
Sun Nov 29 00:34:03 2015 - [info]  Replication filtering check ok.
Sun Nov 29 00:34:03 2015 - [info] Starting SSH connection tests..
Sun Nov 29 00:34:07 2015 - [info] All SSH connection tests passed successfully.
Sun Nov 29 00:34:07 2015 - [info] Checking MHA Node version..
Sun Nov 29 00:34:08 2015 - [info]  Version check ok.
Sun Nov 29 00:34:08 2015 - [info] Checking SSH publickey authentication settings on the current master..
Sun Nov 29 00:34:08 2015 - [info] HealthCheck: SSH to master1 is reachable.
Sun Nov 29 00:34:09 2015 - [info] Master MHA Node version is 0.54.
Sun Nov 29 00:34:09 2015 - [info] Checking recovery script configurations on the current master..
Sun Nov 29 00:34:09 2015 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/mydata/bin_log/ --output_file=/var/tmp/save_binary_logs_test --manager_version=0.55 --start_file=mysql-bin.000002
Sun Nov 29 00:34:09 2015 - [info]   Connecting to root@master1(master1)..
  Creating /var/tmp if not exists..    ok.
  Checking output directory is accessible or not..
   ok.
  Binlog found at /mydata/bin_log/, up to mysql-bin.000002
Sun Nov 29 00:34:09 2015 - [info] Master setting check done.
Sun Nov 29 00:34:09 2015 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Sun Nov 29 00:34:09 2015 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha_rep' --slave_host=master2 --slave_ip=192.168.0.223 --slave_port=3306 --workdir=/var/tmp --target_version=5.6.27-log --manager_version=0.55 --relay_log_info=/mydata/data/relay-log.info  --relay_dir=/mydata/data/  --slave_pass=xxx
Sun Nov 29 00:34:09 2015 - [info]   Connecting to root@192.168.0.223(master2:22)..
Can't exec "mysqlbinlog": No such file or directory at /usr/share/perl5/vendor_perl/MHA/BinlogManager.pm line 99.
mysqlbinlog version not found!
 at /usr/bin/apply_diff_relay_logs line 482
Sun Nov 29 00:34:09 2015 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln195] Slaves settings check failed!
Sun Nov 29 00:34:09 2015 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln375] Slave configuration failed.
Sun Nov 29 00:34:09 2015 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln386] Error happend on checking configurations.  at /usr/bin/masterha_check_repl line 48
Sun Nov 29 00:34:09 2015 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln482] Error happened on monitoring servers.
Sun Nov 29 00:34:09 2015 - [info] Got exit code 1 (Not master dead).
MySQL Replication Health is NOT OK!


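Fixing the error (MHA cannot find mysqlbinlog on the PATH, so link the binaries into /usr/bin on the database hosts):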

ln  -s   /app/mysql/bin/mysql   /usr/bin/mysql
ln  -s   /app/mysql/bin/mysqlbinlog   /usr/bin/mysqlbinlog

IV: MHA test

1. SSH connectivity and master-slave replication must both be checked first

[root@manager ~]# masterha_check_ssh --conf=/usr/local/mha/mha.cnf
[root@manager ~]# masterha_check_repl --conf=/usr/local/mha/mha.cnf
Make sure both commands finish without errors, then start the MHA service.


2. Start the manager service

[root@manager ~]# nohup masterha_manager --conf=/usr/local/mha/mha.cnf > /tmp/mha_manager.log 2>&1 &
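
Steps 1 and 2 can be tied together so that the manager only starts when both checks pass. The sketch below decides by looking for the summary lines that the two tools print on success; the function names are hypothetical, and the log path is the one used in the command above.

```shell
# Return 0 when stdin contains the success line of masterha_check_ssh.
ssh_check_ok() {
    grep -q 'All SSH connection tests passed successfully'
}

# Return 0 when stdin contains the success line of masterha_check_repl
# (the failure line reads "is NOT OK" and does not match).
repl_health_ok() {
    grep -q 'MySQL Replication Health is OK'
}

# Run both checks against config $1 and start the manager only if both pass.
start_manager_if_healthy() {
    conf=$1
    masterha_check_ssh --conf="$conf" 2>&1 | ssh_check_ok \
        || { echo "SSH check failed" >&2; return 1; }
    masterha_check_repl --conf="$conf" 2>&1 | repl_health_ok \
        || { echo "replication check failed" >&2; return 1; }
    nohup masterha_manager --conf="$conf" > /tmp/mha_manager.log 2>&1 &
}
```

It would be called as start_manager_if_healthy /usr/local/mha/mha.cnf.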


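3. Test procedure

  (1) Stop the master and watch the log
  (2) Check the IP addresses on the standby master to see whether the VIP has moved over
  (3) Check every slave with show  slave  status\G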


V: Start the old master again and run CHANGE MASTER on it so that it rejoins as a slave of the new master
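
The coordinates for that CHANGE MASTER can usually be taken from the manager's failover log. The sketch below assumes the log contains a line with the full "CHANGE MASTER TO ..." statement; that wording is an assumption about the log, and the password in it is typically masked, so replace it with the real one (rep123 here). last_change_master is a hypothetical helper name.

```shell
# Hypothetical helper: prints the last "CHANGE MASTER TO ..." statement
# found in the log file $1 (nothing if the log has none).
last_change_master() {
    grep -o 'CHANGE MASTER TO.*' "$1" | tail -n 1
}
```

For example, last_change_master /usr/local/mha/manager.log prints the statement to adapt and run on the old master, followed by start slave.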