MySQLReplication

整体上来说,复制有3个步骤: 
(1) master将改变记录到二进制日志binary log中;
(2) slave将master的binary log events拷贝到它的中继日志relay log;
(3) slave重做中继日志中的事件,将改变反映到它自己的数据。

异步复制:

•MySQL 5.5 版本之前的传统复制实现方式
•复制存在不可靠风险
•主-从机之间的数据库数据可能存在差异
•对数据一致性要求较高的场景,不是主推的方案

半同步复制:

•复制过程更可靠
•保证了数据一致性
•生产环境的首选方案
•半同步复制过程中:如从机遇异常并未能及时响应客户端一个同步事务;则超过系统设定的超时设置后,自动降级成异步方式。

GTID简介
•GTID(Global Transaction ID)是对于一个已提交事务的编号,并且是一个全局唯一的编号。
•GTID实际上是由UUID+TID组成的。其中UUID是一个MySQL实例的唯一标识。TID代表了该实例上已经提交的事务数量,并且随着事务提交单调递增。
•GTID的具体形式:3E11FA47-71CA-11E1-9E33-C80AA9429562:23

GTID的作用
•根据GTID可以知道事务最初是在哪个实例上提交的
•GTID的存在方便了Replication的Failover

GTID的原理:
    从服务器连接到主服务器之后,把自己执行过的GTID(Executed_Gtid_Set) 、获取到的GTID(Retrieved_Gtid_Set)发给主服务器,主服务器把从服务器缺少的GTID及对应的transactions发过去补全即可。当主服务器挂掉的时候,找出同步最成功的那台从服务器,直接把它提升为主即可。如果硬要指定某一台不是最新的从服务器提升为主,先change到同步最成功的那台从服务器,等把GTID全部补全了,就可以把它提升为主了。

GTID复制问题处理方案:

CASE1:
Last_IO_Error:Got fatal error 1236 from master when reading data from binary log: 'The slaveis connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the masterhas purged binary logs containing GTIDs that the slave requires.'

解决方案:

1、mysql> STOP SLAVE;

2、mysql> SHOW SLAVESTATUS\G

3、mysql> RESET MASTER;

4、mysql> SET GLOBALgtid_purged=':1-'

5、mysql> START SLAVE;

 

CASE2:

Last_Error:Could not execute Delete_rows event on table dos.dds_stat_channel_task_info1;Can't find record in 'dds_stat_channel_task_info1', Error_code: 1032; handlererror HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.001772,end_log_pos 103698201

Last_Error:Could not execute Write_rows event on table ses.pases_searchlog; Duplicate entry'EF8E613E74B97E41E043A21FDBF' for key 'PRIMARY', Error_code: 1062; handlererror HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.000018,end_log_pos 109177336

1.    检查具体的报错是否是如上描述,Slave_SQL_Running: NO,Last_SQL_Error为具体的报错。

2.    show slave status\G

Relay_Log_File: mysql-relay.000016

Relay_Log_Pos: 20593960

Executed_Gtid_Set: ae498b08-b570-11e3-a1ff-18a905565190:45

通过带入上面的两个值,然后执行如下命令,从result.sql里面获取具体的报错语句。

mysqlbinlog--no-defaults -v -v --base64-output=DECODE-ROWS --start-position=20593960  mysql-relay.000016 --result-file=result.sql

分析具体的语句,必要时可以查看主库,验证对应的记录是否存在。

确认是否可以跳过该事物,如果主从数据不一致导致,则需要联系开发运营确认。

如果确认可以跳过,则执行下一步跳过该事物。

3.    跳过异常事物

mysql>stop slave;

mysql>set gtid_next='ae498b08-b570-11e3-a1ff-18a905565190:46';           ---开启一个空事务,需要45+1,用于跳过上面的报错。

mysql>BEGIN;

mysql>COMMIT;

mysql>SET GTID_NEXT='AUTOMATIC';    

mysql>start slave;

mysql>show slave status\G           

验证是否恢复正常。

 

CASE3:

Last_Error:Could not execute Update_rows event on table freeswitch.sip_registrations;Can't find record in 'sip_registrations', Error_code: 1032; handler errorHA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin-m.000002, end_log_pos279519

1、mysql> STOP SLAVE;

2、mysql> SHOW SLAVESTATUS\G

3、mysql> SETsql_log_bin = 0;

4、mysql> INSERT INTOtable….(使用mysqlbinlog工具从binlog中提取数据,拼凑成SQL语句在从库插入)

5、mysql> commit;

6、mysql> SETsql_log_bin = 1;

7、mysql> start slave;

CASE4:

Last_Error:Got fatal error 1236 from master when reading data from binary log: 'event readfrom binlog did not pass crc check; the first event '' at 4, the last eventread from '/paic/mylog/3322/mysql-bin.007461' at 173388123, the last byte readfrom '/paic/mylog/3322/mysql-bin.007461' at 173396307.'

补充知识:

How to AvoidReplicating a Statement to the Slave

1.UseReplication Filters

binlog-ignore-db = local

2.Settingthe sql_log_bin Variable

SET sql_log_bin = 0;
INSERT INTO table1 ...
SET sql_log_bin = 1;

3.Use aLOCAL Variant of the Statement

ANALYZE TABLE ...

FLUSH ...

OPTIMIZE TABLE ...

REPAIR TABLE ...

CASE5:
Last_Error: Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file'

到主库找到正确的binlog文件,在从库上停止复制,修改正确之后再启动复制。
First check that the file name given as master_log_file is correct. Look for spelling errors, extra white space, or missing a character at the beginning or end while copy/pasting. If you find that the file name is incorrect:
 STOP SLAVE;
 CHANGE MASTER TO master_log_file='correctFilename';
 START SLAVE;
 If the file name is correct, check the master's data directory. Make sure that the file exists. If it does not, it may be necessary to recreate the slave with a fresh copy of the master and then resume replication with a file that does exist.