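Our platform is about to be migrated. Because of how the business cutover is planned, the request was that during the migration the slave should keep receiving replicated data from the master while also accepting updates of its own.
As a DBA, looking at this purely from the database layer, I was fairly sure this is not feasible. But only a real test is convincing, so let the data and the experiment results speak.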

I set up a test master/slave pair and ran update tests on the slave.

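  1. Set read_only to OFF on the slave so that the slave accepts writes. One thing to note about read_only: it has no effect for root or for any user with instance-wide privileges on *.* (which include SUPER), so such users can write even when it is ON.
  2. Create a test table test2 on the master.
  3. On the slave, test2 has already been replicated.
    Check the row count: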

    mysql> select count(*) from test2;
    +----------+
    | count(*) |
    +----------+
    | 23502 |
    +----------+
    1 row in set (0.01 sec)
  4. On the slave, delete the rows with id < 10 (9 rows, as the output shows).

    mysql> delete from test2 where id<10;
    Query OK, 9 rows affected (0.04 sec)
  5. Check the row count again.
    mysql> select count(*) from test2;
    +----------+
    | count(*) |
    +----------+
    | 23493 |
    +----------+
    1 row in set (0.01 sec)
  6. On the master, check the row count before changing any data.
    mysql> select count(*) from test2;
    +----------+
    | count(*) |
    +----------+
    | 23502 |
    +----------+
    1 row in set (0.01 sec)
  7. On the master, delete the rows with id < 5 (4 rows).
    mysql> delete from test2 where id<5;
    Query OK, 4 rows affected (0.04 sec)
  8. Check the row count.
    mysql> select count(*) from test2;
    +----------+
    | count(*) |
    +----------+
    | 23498 |
    +----------+
    1 row in set (0.01 sec)
  9. Back on the slave, check the replication status.

    mysql> show slave status\G
    *************************** 1. row ***************************
    Slave_IO_State: Waiting for master to send event
    Master_Host: 10.27.20.4
    Master_User: envision
    Master_Port: 3306
    Connect_Retry: 60
    Master_Log_File: mysql-bin.000043
    Read_Master_Log_Pos: 57380204
    Relay_Log_File: replay-bin.000003
    Relay_Log_Pos: 320
    Relay_Master_Log_File: mysql-bin.000043
    Slave_IO_Running: Yes
    Slave_SQL_Running: No
    Replicate_Do_DB:
    Replicate_Ignore_DB:
    Replicate_Do_Table:
    Replicate_Ignore_Table:
    Replicate_Wild_Do_Table:
    Replicate_Wild_Ignore_Table:
    Last_Errno: 1032
    Last_Error: Could not execute Delete_rows event on table cnpmjs.test2; Can't find record in 'test2', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log mysql-bin.000043, end_log_pos 57380173
    Skip_Counter: 0
    Exec_Master_Log_Pos: 57375790
    Relay_Log_Space: 13539720
    Until_Condition: None
    Until_Log_File:
    Until_Log_Pos: 0
    Master_SSL_Allowed: No
    Master_SSL_CA_File:
    Master_SSL_CA_Path:
    Master_SSL_Cert:
    Master_SSL_Cipher:
    Master_SSL_Key:
    Seconds_Behind_Master: NULL
    Master_SSL_Verify_Server_Cert: No
    Last_IO_Errno: 0
    Last_IO_Error:
    Last_SQL_Errno: 1032
    Last_SQL_Error: Could not execute Delete_rows event on table cnpmjs.test2; Can't find record in 'test2', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log mysql-bin.000043, end_log_pos 57380173
    Replicate_Ignore_Server_Ids:
    Master_Server_Id: 102
    Master_UUID: b095e989-7dcf-11e8-83a2-0017fa032e39
    Master_Info_File: /var/lib/mysql/master.info
    SQL_Delay: 0
    SQL_Remaining_Delay: NULL
    Slave_SQL_Running_State:
    Master_Retry_Count: 86400
    Master_Bind:
    Last_IO_Error_Timestamp:
    Last_SQL_Error_Timestamp: 180702 14:54:00
    Master_SSL_Crl:
    Master_SSL_Crlpath:
    Retrieved_Gtid_Set:
    Executed_Gtid_Set:
    Auto_Position: 0
    Replicate_Rewrite_DB:
    Channel_Name:
    Master_TLS_Version:
    1 row in set (0.00 sec)

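    The error above shows that the master and slave data have diverged: the SQL thread has stopped (Slave_SQL_Running: No) with error 1032 because the Delete_rows event could not find rows that were already deleted on the slave. Replication cannot continue.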

  10. Repair the data on the slave by re-inserting the 9 missing rows.
    mysql> insert into test2 select * from user where id<10;
    Query OK, 9 rows affected (0.01 sec)
    Records: 9 Duplicates: 0 Warnings: 0
  11. After the repair, restart the slave threads.
    mysql> stop slave;
    Query OK, 0 rows affected (0.00 sec)
    mysql> start slave;
    Query OK, 0 rows affected (0.01 sec)
  12. Check the status again: the slave has resumed replication.
    mysql> show slave status\G
    *************************** 1. row ***************************
    Slave_IO_State: Waiting for master to send event
    Master_Host: 10.27.20.4
    Master_User: envision
    Master_Port: 3306
    Connect_Retry: 60
    Master_Log_File: mysql-bin.000043
    Read_Master_Log_Pos: 57380204
    Relay_Log_File: replay-bin.000005
    Relay_Log_Pos: 320
    Relay_Master_Log_File: mysql-bin.000043
    Slave_IO_Running: Yes
    Slave_SQL_Running: Yes
    Replicate_Do_DB:
    Replicate_Ignore_DB:
    Replicate_Do_Table:
    Replicate_Ignore_Table:
    Replicate_Wild_Do_Table:
    Replicate_Wild_Ignore_Table:
    Last_Errno: 0
    Last_Error:
    Skip_Counter: 0
    Exec_Master_Log_Pos: 57380204
    Relay_Log_Space: 688
    Until_Condition: None
    Until_Log_File:
    Until_Log_Pos: 0
    Master_SSL_Allowed: No
    Master_SSL_CA_File:
    Master_SSL_CA_Path:
    Master_SSL_Cert:
    Master_SSL_Cipher:
    Master_SSL_Key:
    Seconds_Behind_Master: 0
    Master_SSL_Verify_Server_Cert: No
    Last_IO_Errno: 0
    Last_IO_Error:
    Last_SQL_Errno: 0
    Last_SQL_Error:
    Replicate_Ignore_Server_Ids:
    Master_Server_Id: 102
    Master_UUID: b095e989-7dcf-11e8-83a2-0017fa032e39
    Master_Info_File: /var/lib/mysql/master.info
    SQL_Delay: 0
    SQL_Remaining_Delay: NULL
    Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
    Master_Retry_Count: 86400
    Master_Bind:
    Last_IO_Error_Timestamp:
    Last_SQL_Error_Timestamp:
    Master_SSL_Crl:
    Master_SSL_Crlpath:
    Retrieved_Gtid_Set:
    Executed_Gtid_Set:
    Auto_Position: 0
    Replicate_Rewrite_DB:
    Channel_Name:
    Master_TLS_Version:
    1 row in set (0.00 sec)
  13. The row count on the slave now matches the master's count after its 4-row delete.
    mysql> select count(*) from test2;
    +----------+
    | count(*) |
    +----------+
    | 23498 |
    +----------+
    1 row in set (0.01 sec)
  14. Update the master again and watch how replication behaves.
    mysql> insert into test2 select * from test2;
    Query OK, 23498 rows affected (1.01 sec)
    Records: 23498 Duplicates: 0 Warnings: 0
  15. Check the row count on the master.
    mysql> select count(*) from test2;
    +----------+
    | count(*) |
    +----------+
    | 46996 |
    +----------+
    1 row in set (0.02 sec)
  16. Query the slave: replication is working normally.
    mysql> select count(*) from test2;
    +----------+
    | count(*) |
    +----------+
    | 46996 |
    +----------+
    1 row in set (0.02 sec)

    The tests so far covered delete. Next, test whether update is possible.

  17. On the slave, modify the data.
    mysql> update test2 set json=0 where id=5;
    Query OK, 2 rows affected (0.08 sec)
    Rows matched: 2 Changed: 2 Warnings: 0
    mysql> select * from test2 where id=5\G
    *************************** 1. row ***************************
    id: 5
    gmt_create: 2016-02-23 12:28:51
    gmt_modified: 2016-02-23 12:28:51
    name: m_gol
    salt: 0
    password_sha: 0
    ip: 0
    roles: []
    rev: 2-379c3d7dfc06312105072ec0ccf84b4a
    email: m.goleb@gmail.com
    json: 0
    npm_user: 1
  18. On the master, modify the same rows.
    mysql> update test2 set json=1000 where id=5;
    Query OK, 0 rows affected (0.07 sec)
    Rows matched: 2 Changed: 0 Warnings: 0
    mysql> select * from test2 where id=5\G
    *************************** 1. row ***************************
    id: 5
    gmt_create: 2016-02-23 12:28:51
    gmt_modified: 2016-02-23 12:28:51
    name: m_gol
    salt: 0
    password_sha: 0
    ip: 0
    roles: []
    rev: 2-379c3d7dfc06312105072ec0ccf84b4a
    email: m.goleb@gmail.com
    json: 1000
    npm_user: 1
  19. On the slave, check the replication status.
    mysql> show slave status\G
    *************************** 1. row ***************************
    Slave_IO_State: Waiting for master to send event
    Master_Host: 10.27.20.4
    Master_User: envision
    Master_Port: 3306
    Connect_Retry: 60
    Master_Log_File: mysql-bin.000043
    Read_Master_Log_Pos: 70913715
    Relay_Log_File: replay-bin.000005
    Relay_Log_Pos: 13529969
    Relay_Master_Log_File: mysql-bin.000043
    Slave_IO_Running: Yes
    Slave_SQL_Running: No
    Replicate_Do_DB:
    Replicate_Ignore_DB:
    Replicate_Do_Table:
    Replicate_Ignore_Table:
    Replicate_Wild_Do_Table:
    Replicate_Wild_Ignore_Table:
    Last_Errno: 1032
    Last_Error: Could not execute Update_rows event on table cnpmjs.test2; Can't find record in 'test2', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log mysql-bin.000043, end_log_pos 70912987
    Skip_Counter: 0
    Exec_Master_Log_Pos: 70909853
    Relay_Log_Space: 13534935
    Until_Condition: None
    Until_Log_File:
    Until_Log_Pos: 0
    Master_SSL_Allowed: No
    Master_SSL_CA_File:
    Master_SSL_CA_Path:
    Master_SSL_Cert:
    Master_SSL_Cipher:
    Master_SSL_Key:
    Seconds_Behind_Master: NULL
    Master_SSL_Verify_Server_Cert: No
    Last_IO_Errno: 0
    Last_IO_Error:
    Last_SQL_Errno: 1032
    Last_SQL_Error: Could not execute Update_rows event on table cnpmjs.test2; Can't find record in 'test2', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log mysql-bin.000043, end_log_pos 70912987
    Replicate_Ignore_Server_Ids:
    Master_Server_Id: 102
    Master_UUID: b095e989-7dcf-11e8-83a2-0017fa032e39
    Master_Info_File: /var/lib/mysql/master.info
    SQL_Delay: 0
    SQL_Remaining_Delay: NULL
    Slave_SQL_Running_State:
    Master_Retry_Count: 86400
    Master_Bind:
    Last_IO_Error_Timestamp:
    Last_SQL_Error_Timestamp: 180702 15:23:42
    Master_SSL_Crl:
    Master_SSL_Crlpath:
    Retrieved_Gtid_Set:
    Executed_Gtid_Set:
    Auto_Position: 0
    Replicate_Rewrite_DB:
    Channel_Name:
    Master_TLS_Version:
    1 row in set (0.00 sec)

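    In summary, the slave must not be modified, especially not the same rows of the same table that the master also changes. As soon as a replicated event cannot find the row it expects, replication fails.
    This holds even when the local change is only an update rather than a delete.
    So the slave should stay read_only to keep master-slave replication running reliably.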
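
A stopped SQL thread like the ones in steps 9 and 19 can also be detected by a script instead of by eye. The following is a minimal sketch, assuming the vertical output of show slave status has already been captured as text; the helper names parse_slave_status and replication_broken are hypothetical, and the sample is a shortened copy of the output shown earlier.

```python
def parse_slave_status(text):
    """Parse the vertical output of SHOW SLAVE STATUS into a dict."""
    status = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, the row separator, and lines without a field name.
        if not line or line.startswith("*") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        status[key.strip()] = value.strip()
    return status


def replication_broken(status):
    """Return True when the SQL thread is not running and reports an error."""
    return (status.get("Slave_SQL_Running") != "Yes"
            and status.get("Last_SQL_Errno", "0") != "0")


# Shortened sample taken from the status output of the failed delete test.
sample = """
*************************** 1. row ***************************
Slave_IO_Running: Yes
Slave_SQL_Running: No
Last_SQL_Errno: 1032
Last_SQL_Error: Could not execute Delete_rows event on table cnpmjs.test2; Can't find record in 'test2', Error_code: 1032
Seconds_Behind_Master: NULL
"""

status = parse_slave_status(sample)
print(status["Slave_SQL_Running"], status["Last_SQL_Errno"])
print(replication_broken(status))
```

Only the first colon on each line splits the field name from its value, so error messages that contain colons themselves stay intact.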

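The reason is that the binlog (in row format here, as the Delete_rows and Update_rows events in the errors show) records the row images of every change. If the row on the slave no longer matches because it was modified on the slave, the SQL thread cannot find the row it needs when it applies the next binlog event, so the event cannot be applied and replication on the slave fails.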
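
The failure mode can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not how MySQL is implemented: each table is a list of dicts, the function apply_row_event and the sample values are hypothetical, and the row is located by comparing the whole before-image. How MySQL actually locates the row depends on the table's keys and on settings such as binlog_row_image, which the sketch does not model.

```python
class RowNotFound(Exception):
    """Stands in for MySQL error 1032 (Can't find record)."""


def apply_row_event(table, event):
    """Apply one row event to `table`, a list of row dicts.

    Toy model: update and delete events locate their target by
    comparing every column against the event's before-image.
    """
    kind, before, after = event
    if kind == "insert":
        table.append(dict(after))
        return
    for i, row in enumerate(table):
        if row == before:
            if kind == "delete":
                del table[i]
            else:  # update
                table[i] = dict(after)
            return
    raise RowNotFound(f"cannot apply {kind}: no row matches {before}")


# Master and slave start out identical (hypothetical values).
master = [{"id": 5, "json": 7}]
slave = [{"id": 5, "json": 7}]

# A local write on the slave only; the master never sees it.
slave[0]["json"] = 0

# The master's update is logged with its before- and after-image.
event = ("update", {"id": 5, "json": 7}, {"id": 5, "json": 1000})
apply_row_event(master, event)

try:
    apply_row_event(slave, event)
    result = "applied"
except RowNotFound as exc:
    result = "stopped"
    print(exc)
print(result)
```

In this model the slave keeps its locally written value and the event is never applied, which mirrors the stopped SQL thread seen in the status output.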

For reference, here is an excerpt from a binlog:

 mysqlbinlog mysql-bin.000005 >> test.log
#181126  9:53:49 server id 101  end_log_pos 8342676 CRC32 0x928e28de    Anonymous_GTID  last_committed=2436 sequence_number=2437    rbr_only=yes
/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
SET @@SESSION.GTID_NEXT= 'ANONYMOUS'/*!*/;
# at 8342676
#181126  9:53:49 server id 101  end_log_pos 8342758 CRC32 0x1068a385    Query   thread_id=58717 exec_time=0 error_code=0
SET TIMESTAMP=1543226029/*!*/;
BEGIN
/*!*/;
# at 8342758
#181126  9:53:49 server id 101  end_log_pos 8342849 CRC32 0x2ee04d35    Table_map: `test`.`app_purchases` mapped to number 290
# at 8342849
#181126  9:53:49 server id 101  end_log_pos 8343168 CRC32 0x598ec3d0    Update_rows: table id 290 flags: STMT_END_F

BINLOG '
rcL7WxNlAAAAWwAAAEFNfwAAACIBAAAAAAEADmVvc19wb3J0YWxfd2ViAA1hcHBfcHVyY2hhc2Vz
AAsPDw8PAQ8SEhESAQ7AAMAAwADAAP0CAAAAAPgHNU3gLg==
rcL7Wx9lAAAAPwEAAIBOfwAAACIBAAAAAAEAAgAL/////2D7JDBlM2FmNzJkLWIxNDctNGQ3My1h
ZjY5LTBhNDQxZmZmMTdiMSQzN2Q2ZDJjZS1hZGU2LTRhNzEtOTEzYS1lNDJhMTQ5NmQxMDAPbzE1
NDI1MjM2NjI4NDcxJDM3ZDZkMmNlLWFkZTYtNGE3MWQ5ZDc1NWUyLWJmYTQtNDA3ZgKZoXSdawBg
+CQwZTNhZjcyZC1iMTQ3LTRkNzMtYWY2OS0wYTQ0MWZmZjE3YjEkMzdkNmQyY2UtYWRlNi00YTcx
LTkxM2EtZTQyYTE0OTZkMTAwD28xNTQyNTIzNjYyODQ3MSQzN2Q2ZDJjZS1hZGU2LTRhNzFkOWQ3
NTVlMi1iZmE0LTQwN2YCmaF0nWtb+8KtmaF0nXEB0MOOWQ==
'/*!*/;
# at 8343168
#181126  9:53:49 server id 101  end_log_pos 8343199 CRC32 0xcb9106dc    Xid = 400668
COMMIT/*!*/;
# at 8343199