1.服务器宕机

1.1 实验环境

主服务器:mysql3306

从服务器1:mysql3307

从服务器2:mysql3308

问题:主服务器宕机,切换3307为主库

我用的是多实例,模拟服务器宕机将3306端口数据库停掉。

1.2 查看两个从库同步的状态

(1)查看3307线程的状态

mysql> show processlist\G
*************************** 1. row ***************************
 Id: 1
   User: system user
   Host: 
 db: NULL
Command: Connect
   Time: 1857
  State: Waiting for master to send event
   Info: NULL
*************************** 2. row ***************************
 Id: 2
   User: system user
   Host: 
 db: NULL
Command: Connect
   Time: 1857
  State: Slave has read all relay log; waiting for the slave I/O thread to update it
   Info: NULL
*************************** 3. row ***************************
 Id: 3
   User: root
   Host: localhost
 db: NULL
Command: Query
   Time: 0
  State: NULL
   Info: show processlist
3 rows in set (0.00 sec)

(2)查看3308线程的状态

mysql> show processlist\G 
*************************** 1. row ***************************
 Id: 1
   User: system user
   Host: 
 db: NULL
Command: Connect
   Time: 1930
  State: Waiting for master to send event
   Info: NULL
*************************** 2. row ***************************
 Id: 2
   User: system user
   Host: 
 db: NULL
Command: Connect
   Time: 1930
  State: Slave has read all relay log; waiting for the slave I/O thread to update it
   Info: NULL
*************************** 3. row ***************************
 Id: 3
   User: root
   Host: localhost
 db: NULL
Command: Query
   Time: 0
  State: NULL
   Info: show processlist
3 rows in set (0.00 sec)

查看看两个线程的更新状态,确保所有relay log已经复制完毕,在每个从库上执行下面命令,直到看到has read all relay或者Slave has read all relay log

mysql> stop slave io_thread;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> show processlist\G 

1.3 选择要切换的从库

分别查看从库master.info,将binlog位置最靠前更新最快POS最大作为主库的从库,切换为主库,这里将3307切换为主库。

[root@linzhongniao ~]# sed -n '2,3p' /data/3307/data/master.info
mysqlbin_linzhongniao.000039
107
[root@linzhongniao ~]# sed -n '2,3p' /data/3308/data/master.info  
mysqlbin_linzhongniao.000039
107

1.4 要切换的从库上的操作

(1)登录从库执行下面操作

mysql> stop slave;
Query OK, 0 rows affected (0.00 sec)
mysql> reset master;
Query OK, 0 rows affected (0.00 sec)

mysql> quit; 

(2)删除master.info和relay-log.info

进入到要切换主库的数据库目录中,删除master.info和relay-log.info

[root@linzhongniao 3307]# pwd
/data/3307
[root@linzhongniao 3307]# rm -f relay-log.info 
[root@linzhongniao 3307]# cd data/
[root@linzhongniao data]# pwd
/data/3307/data
[root@linzhongniao data]# rm -f master.info 

(3)检查授权表和read-only等参数

3307的配置文件中做了授权表和read-log参数,将它们注释掉

[root@linzhongniao data]# egrep "read-only|replicate-wild" /data/3307/my.cnf
#read-only
#replicate-wild-ignore-table = mysql.%

(4)3307提升为主库

开启binlog

[root@linzhongniao data]# grep "log-bin" /data/3307/my.cnf   
log-bin = /data/3307/mysqlbin_linzhongniao

重启数据库服务

[root@linzhongniao data]# /data/3307/mysql restart
Restarting MySQL...
Stoping MySQL....
Starting MySQL......

至此从库3307提升为主库完毕

1.5 其他从库的操作

(1)查看3307主库master的状态,记录binlog和位置

mysql> show master status\G
*************************** 1. row ***************************
File: mysqlbin_linzhongniao.000002
Position: 107
Binlog_Do_DB: 
Binlog_Ignore_DB: 
1 row in set (0.00 sec)

(2)change master

将master.info和relay-log.info删除,重新change master

[root@linzhongniao 3308]# rm -f relay-log.info 
[root@linzhongniao 3308]# cd data/
[root@linzhongniao data]# rm -f master.info
mysql> CHANGE MASTER TO
 -> MASTER_HOST='192.168.136.113', 如果做域名解析了直接修改host文件即可
 -> MASTER_PORT=3306, 
 -> MASTER_USER='rep',
 -> MASTER_PASSWORD='123456',
 -> MASTER_LOG_FILE='mysqlbin_linzhongniao.000002',
 -> MASTER_LOG_POS=107;
Query OK, 0 rows affected (0.10 sec)

mysql> start slave;
Query OK, 0 rows affected (0.00 sec)
mysql> show slave status\G
 *************************** 1. row ***************************
   Slave_IO_State: Connecting to master
  Master_Host: 192.168.136.113
  Master_User: rep
  Master_Port: 3306
Connect_Retry: 60
  Master_Log_File: mysqlbin_linzhongniao.000002
  Read_Master_Log_Pos: 107
   Relay_Log_File: relay-bin.000001
Relay_Log_Pos: 4
Relay_Master_Log_File: mysqlbin_linzhongniao.000002
 Slave_IO_Running: Connecting
Slave_SQL_Running: Yes
  Replicate_Do_DB: 
  Replicate_Ignore_DB: 
   Replicate_Do_Table: 
   Replicate_Ignore_Table: 
  Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
   Last_Errno: 0
   Last_Error: 
 Skip_Counter: 0
  Exec_Master_Log_Pos: 107
  Relay_Log_Space: 107
  Until_Condition: None
   Until_Log_File: 
Until_Log_Pos: 0
   Master_SSL_Allowed: No
   Master_SSL_CA_File: 
   Master_SSL_CA_Path: 
  Master_SSL_Cert: 
Master_SSL_Cipher: 
   Master_SSL_Key: 
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 2003
Last_IO_Error: error connecting to master 'rep@192.168.136.113:3306' - retry-time: 60  retries: 86400
   Last_SQL_Errno: 0
   Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
 Master_Server_Id: 1
1 row in set (0.00 sec)

报错了提示找不到3306这个数据库服务器的端口,原因是在change master的时候把MASTER_PORT的写的还是3306,这个地方也是故意这么写的,在其他数据库操作的时候时候授权的用户名密码是不用改的,其他的有可能要修改,这时不必删除master.info和relay-log.info文件,直接用change master to修改即可。之所以删除重新change master是不想产生不必要的麻烦。

mysql> stop slave;
Query OK, 0 rows affected (0.00 sec)

mysql> change master to MASTER_PORT=3307;
Query OK, 0 rows affected (0.00 sec)

mysql> start slave;   
Query OK, 0 rows affected (0.00 sec)

至此主库宕机切换成功。

2.数据库宕机

主库宕机,服务器还能起来登录其他从库查看线程状态,确保数据数据复制完毕。如果没有同步完毕,登录每个仓库查看从库的主从复制状态,将没有同步的binlog数据备份,在从其中一个库补全把差距补回来(增量备份)。然后把这个从库作为主库

3.主库以外宕机

例如:我们有计划切换,怎么做呢

(1)登录所有从库查看同步的状态,如果都一样主库锁表,要切换主库的从库关闭同步,设置同步参数change master。

(2)登录所有的库查看同步状态,是否完成。

4.从库slave宕机

恢复方法:重做slave

(1)导入主库的数据

mysql –uroot –p123456 –S /data/3307/mysql.sock < test.sql &

(2)配置从库同步参数Change master

mysql> CHANGE MASTER TO
 -> MASTER_HOST='192.168.130.113',   主库的IP
 -> MASTER_PORT=3306, 主库的端口,从库端口可以和主库不同
 -> MASTER_USER='rep',主库上建立的用于复制的用户rep
 -> MASTER_PASSWORD='123456', 这里是rep的密码
 -> MASTER_LOG_FILE='mysql-bin.000036',这里是show master status;查看到的二进制日志文件名称注意不能有空格。
 -> MASTER_LOG_POS=335;这里也是show master status时看到的二进制日志偏移量注意不能多空格。

(3)开启从库同步开关,查看同步状态

start slave;
show slave status\G