ERROR日志切换方法

MySQL官方说如果要重命名错误日志文件,请在刷新之前手动执行此操作。 然后执行FLUSH ERROR LOGS;

If you flush the error log using FLUSH ERROR LOGS, FLUSH LOGS, or mysqladmin flush-logs, the server closes and reopens any error log file to which it is writing. To rename an error log file, do so manually before flushing. Flushing the logs then opens a new file with the original file name. For example, assuming a log file name of _`host_name`_.err, to rename the file and create a new one, use the following commands:

mv host_name.err host_name.err-old
mysqladmin flush-logs
mv host_name.err-old backup-directory

批量进行error日志切换

生产上有78套一主两从的库,error日志已经跑了1年半左右时间,error日志很大,排查问题极不方便,使用ansible批量进行error日志的切换,将脚本传入到每一台主机(主从都传了)

首先对日志进行mv重命名,然后登录mysql执行flush error logs

#!/bin/bashMYSQL_USER=cjr
MYSQL_PASSWD_EN='Y2pyCg=='
MYSQL_PASSWD=`echo $MYSQL_PASSWD_EN |base64 -d`


GET_ERRORLOGPATH()
{
      ERROPATH=`ps -ef | grep mysqld | grep -v "grep" |awk  '{for(i=1;i<=NF;i++){print $i;}}' | grep log-error`
}

RENAME()
{
GET_ERRORLOGPATH
        for errlog in $ERROPATH
        do
           echo $errlog
           logpath=`echo $errlog |  awk -F '=' '{print $2}'`
           echo $logpath
                   find $logpath  -exec mv {} {}_`date +%Y%m%d` \;  
        done
}

FLUSH_LOG(){
           sockets=`ps -ef | grep mysqld | grep -v "grep" |awk  '{for(i=1;i<=NF;i++){print $i;}}' | grep socket`
                for sockpath in $sockets
                do
                        echo $sockpath
                        sock=`echo $sockpath |  awk -F '=' '{print $2}'`
                        mysql -u$MYSQL_USER -p$MYSQL_PASSWD -S$sock  -e "FLUSH ERROR LOGS;"
        done
}

MAIN(){
RENAME
FLUSH_LOG
}

FLUSH的一些说明

MySQL官网对flush的说明如下,说,进行flush logs ,flush binary logs,flush tables with read lock等操作是不写binlog的。

By default, the server writes FLUSH statements to the binary log so
that they replicate to replicas. To suppress logging, specify the
optional NO_WRITE_TO_BINLOG keyword or its alias LOCAL.

*Note*:

FLUSH LOGS, FLUSH BINARY LOGS, FLUSH TABLES WITH READ LOCK (with or
without a table list), and FLUSH TABLES tbl_name ... FOR EXPORT are not
written to the binary log in any case because they would cause problems
if replicated to a replica.

也就是说,其他的flush 操作是需要写binlog的

测试验证

批量切换error日志的问题

由于操作时,是按照flush error logs 不写binlog的思路进行脚本编写的,实际上flush error logs是要写binlog日志的,

这就导致主从的事务会不一致,也就是主库上的flush error logs传到了从库,而另外两个从库也分别执行的flush error logs,这会导致之后的主从切换会因为gtid的不一致而失败。

在某个一主两次执行

show slave hosts;
select @@server_uuid;
show master status\G

图1

图2

图3

mysql> show slave hosts;  
Empty set (0.00 sec)

mysql> select @@server_uuid;  
+--------------------------------------+
| @@server_uuid                        |
+--------------------------------------+
| 46f5d354-7fda-11ea-9334-5098b8d2d730 |
+--------------------------------------+
1 row in set (0.00 sec)

mysql> show master status\G 
*************************** 1. row ***************************
             File: mysql-bin.002259
         Position: 607268630
     Binlog_Do_DB: 
 Binlog_Ignore_DB: 
Executed_Gtid_Set: 25ae2df5-7fda-11ea-8812-5098b8d22110:1-31694640,
46f5d354-7fda-11ea-9334-5098b8d2d730:1-15041,
68aa38a2-7fda-11ea-92ca-5098b8d28722:1-320755280
1 row in set (0.00 sec)
mysql>   exit; 
------------------------------------------------------------------------
mysql> show slave hosts;  
Empty set (0.00 sec)

mysql> select @@server_uuid;  
+--------------------------------------+
| @@server_uuid                        |
+--------------------------------------+
| 25ae2df5-7fda-11ea-8812-5098b8d22110 |
+--------------------------------------+
1 row in set (0.00 sec)

mysql> show master status\G 
*************************** 1. row ***************************
             File: mysql-bin.002255
         Position: 610321753
     Binlog_Do_DB: 
 Binlog_Ignore_DB: 
Executed_Gtid_Set: 25ae2df5-7fda-11ea-8812-5098b8d22110:1-31694641,
46f5d354-7fda-11ea-9334-5098b8d2d730:1-15040,
68aa38a2-7fda-11ea-92ca-5098b8d28722:1-320757139
1 row in set (0.00 sec)

mysql>   exit; ------------------------------------------------------------------------
mysql> show slave hosts;  
+------------+-----------------------+------+-----------+--------------------------------------+
| Server_id  | Host                  | Port | Master_id | Slave_UUID                           |
+------------+-----------------------+------+-----------+--------------------------------------+
| 3809411732 | ** .** .** .**        | 8802 |  65748802 |                                      |
|   65728802 | ** .** .** .**        | 8802 |  65748802 | 46f5d354-7fda-11ea-9334-5098b8d2d730 |
|   65738802 | ** .** .** .**        | 8802 |  65748802 | 25ae2df5-7fda-11ea-8812-5098b8d22110 |
+------------+-----------------------+------+-----------+--------------------------------------+
3 rows in set (0.00 sec)

mysql> select @@server_uuid;  
+--------------------------------------+
| @@server_uuid                        |
+--------------------------------------+
| 68aa38a2-7fda-11ea-92ca-5098b8d28722 |
+--------------------------------------+
1 row in set (0.00 sec)

mysql> show master status\G 
*************************** 1. row ***************************
             File: mysql-bin.002258
         Position: 681441868
     Binlog_Do_DB: 
 Binlog_Ignore_DB: 
Executed_Gtid_Set: 25ae2df5-7fda-11ea-8812-5098b8d22110:1-31694640,
46f5d354-7fda-11ea-9334-5098b8d2d730:1-15040,
68aa38a2-7fda-11ea-92ca-5098b8d28722:1-320757753
1 row in set (0.00 sec)

可以看到图3的库是主库,图1,图2是从库

主库的UUID

68aa38a2-7fda-11ea-92ca-5098b8d28722

从库1(图1)UUID

46f5d354-7fda-11ea-9334-5098b8d2d730 从库1

其Gtid_set为46f5d354-7fda-11ea-9334-5098b8d2d730:1-15041,

比另外两个Gtid_Set多一个事务46f5d354-7fda-11ea-9334-5098b8d2d730:1-15040

从库2(图2)UUID

25ae2df5-7fda-11ea-8812-5098b8d22110,从库2

其Gtid_set为25ae2df5-7fda-11ea-8812-5098b8d22110:1-31694641,

比另外两个Gtid_Set多一个事务25ae2df5-7fda-11ea-8812-5098b8d22110:1-31694640

Gtid修复

通过跳过空事务的方法使gtid一致,为避免出错,可以分别在每个库(主从)上执行如下

set sql_log_bin=off;
SET GTID_NEXT='46f5d354-7fda-11ea-9334-5098b8d2d730:15041';
BEGIN;

COMMIT;

SET GTID_NEXT='25ae2df5-7fda-11ea-8812-5098b8d22110:31694641';
BEGIN;

COMMIT;
SET GTID_NEXT='AUTOMATIC';

总结

flush error logs会导致写binlog,如果在主从都执行了flush error logs 会使主从gtid不一致。

解决办法可以是批量mv error log后,在主库执行flush error logs 或者在批量脚本中,将sql_log_bin关闭后 set sql_log_bin=off;flush error logs;则可以在主从都执行。