环境:

192.168.205.17: as master server
192.168.205.27: as slave server
192.168.205.37: as slave server

版本:

OS: centos 7 1810 with mini install
mariadb-5.5.60

目地:

当数据库运行了一段时间后,如何再做主从复制,并当主服务器down机时,如何提升从为主

步骤:

1. 配置主服务器
2. 备份主服务器
3. 配置从服务器
4. 将备份恢复从服务器
5. 测试主从复制
6. 模拟主服务器down机,提升从节点
7. 测试

主服务器192.168.205.17

  1. 首先安装MariaDB
[root@Master ~]#yum install mariadb-server
  1. 分创建数据目录和logs目录
[root@Master ~]#mkdir /data/{mysql,logs}  
[root@Master ~]#chown -R mysql:mysql /data/{mysql,logs}  
  1. 修改配置文件
[root@Master ~]#vi /etc/my.cnf  
[mysqld]  
server_id=17  #服务器ID必须唯一  
datadir=/data/mysql  #数据文件路径  
log_bin=/data/logs/bin  #日志文件路径  
[root@Master ~]#systemctl restart mariadb  
  1. 建立复制帐号,此帐号是从服务器与主服务器同步时连接用的帐号,因为对所有数据库同步,所以是*.*
MariaDB [(none)]> GRANT replication slave on *.* to repluser@'192.168.205.%' identified by 'centos';  
  1. 备份数据库,因为你的数据库运行了一段时间,为了把以前的数据也要同步过去,所以先备份主的服务器数据,再恢复到从的节点上,其中--master-data=1表示是从从节点上恢复数据,在备份的sql中会启用change master to语句
[root@Master ~]#mysqldump -A --single-transaction --master-data=1 -F >/data/all.mysql  
  1. 将备份的文件复制到slave服务器上
[root@Master ~]#scp /data/all.mysql 192.168.205.27:/data/  

从服务器192.168.205.27

  1. 安装MariaDB在slave服务器上
[root@slave ~]#yum install mariadb-server
  1. 修改配置文件
[root@slave ~]#vi /etc/my.cnf 
[mysqld]  
datadir=/data/mysql  
read-only  #只给slave数据只读权限,当然只能限制普通帐号  
log-bin=/data/logs/bin  
server-id=27  #修改server-id一样和主不一样才行
  1. 创建数据和日志文件夹并更改所有者和所有组为mysql
[root@slave ~]#mkdir /data/{mysql,logs}  
[root@slave ~]#chown mysql:mysql /data/{mysql,logs}   
  1. 在slave服务器中打开备份的文件,添加如下的内容
[root@salve data]#vi all.mysql  
CHANGE MASTER TO  
MASTER_HOST='192.168.205.17',  
MASTER_USER='repluser',  
MASTER_PASSWORD='centos',  
MASTER_LOG_FILE='bin.000005',  
MASTER_LOG_POS=245; 
  1. 启动服务
[root@slave ~]#systemctl restart mariadb  
  1. 直接恢恢复数据库
    [root@slave ~]#mysql < /data/all.mysql
  2. 连接mariaDB并查看状态
[root@slave ~]#mysql  
MariaDB [(none)]> show variables like 'server_id';  
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| server_id     | 27    |
+---------------+-------+
1 row in set (0.00 sec)
MariaDB [(none)]> show variables like 'read_only';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| read_only     | ON    |
+---------------+-------+
1 row in set (0.00 sec)
MariaDB [(none)]> show slave status\G
...
             Slave_IO_Running: No
            Slave_SQL_Running: No
...
  1. 起动slave I/O thread 和slave SQL thread线程, 并查看状态
MariaDB [(none)]> start slave;
Query OK, 0 rows affected (0.00 sec)
MariaDB [(none)]> show slave status\G
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.205.17
                  Master_User: repluser
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: bin.000005
          Read_Master_Log_Pos: 402
               Relay_Log_File: mariadb-relay-bin.000003
                Relay_Log_Pos: 523
        Relay_Master_Log_File: bin.000005
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
  1. 查看数据库是否同步过来
MariaDB [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| hellodb            |
| mysql              |
| performance_schema |
| test               |
| zhaoli             |
+--------------------+
6 rows in set (0.00 sec)
  1. 查看网络连接,已经连接到主服务器的3306
[root@slave data]#ss -nt
State      Recv-Q Send-Q               Local Address:Port                              Peer Address:Port              
ESTAB      0      96                  192.168.205.27:22                               192.168.205.1:17526              
ESTAB      0      0                   192.168.205.27:56360                           192.168.205.17:3306 

测试

  1. 在主的服务器上,建立一个数据库,并测试数据库同步
MariaDB [(none)]> create database db1;
Query OK, 1 row affected (0.00 sec)
  1. 在从服务器上看有没有数据库db1
MariaDB [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| db1                |
| hellodb            |
| mysql              |
| performance_schema |
| test               |
| zhaoli             |
+--------------------+
7 rows in set (0.00 sec)
  1. 如果在主和从服务器上同时创建了一个数据库db2,会引起线程停止,复制也会停止 主服务器:
MariaDB [(none)]> create database db2;
Query OK, 1 row affected (0.00 sec)

从服务器:

MariaDB [(none)]> create database db2;
Query OK, 1 row affected (0.00 sec)

此时在从服务器上查看状态:

MariaDB [(none)]> show slave status\G
...
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
               Last_SQL_Errno: 1007
                   Last_Error: Error 'Can't create database 'db2'; database exists' on query. Default database: 'db2'. Query: 'create database db2'
...  

此时在主服务器中再建立数据库db3:

MariaDB [(none)]> create database db3;
Query OK, 1 row affected (0.00 sec)  

再从服务器上看没有同步:

MariaDB [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| db1                |
| db2                |
| hellodb            |
| mysql              |
| performance_schema |
| test               |
| zhaoli             |
+--------------------+
8 rows in set (0.00 sec)
  1. 我们可以在从服务器上删除数据库db2,来解决同步冲突的问题,但错误还在
MariaDB [hellodb]> drop database db2;
Query OK, 0 rows affected (0.00 sec)
  1. 我们必须要重启slave进程才能继续复制 从服务器上重启线程:
MariaDB [hellodb]> stop slave;
Query OK, 0 rows affected (0.00 sec)
MariaDB [hellodb]> start slave;
Query OK, 0 rows affected (0.00 sec)
MariaDB [hellodb]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| db1                |
| db2                |
| db3                | 
| hellodb            |
| mysql              |
| performance_schema |
| test               |
| zhaoli             |
+--------------------+
9 rows in set (0.00 sec)
  1. 同样也们也可以对表进行操作,在主和从的相同的库和表中插入一个条主键相同记录 从服务器
    MariaDB [(none)]> use hellodb;
    MariaDB [hellodb]> show tables;
    +-------------------+
    | Tables_in_hellodb |
    +-------------------+
    | classes           |
    | coc               |
    | courses           |
    | scores            |
    | students          |
    | teachers          |
    | toc               |
    +-------------------+
    7 rows in set (0.00 sec)
    
    MariaDB [hellodb]> desc teachers;
    +--------+----------------------+------+-----+---------+----------------+
    | Field  | Type                 | Null | Key | Default | Extra          |
    +--------+----------------------+------+-----+---------+----------------+
    | TID    | smallint(5) unsigned | NO   | PRI | NULL    | auto_increment |
    | Name   | varchar(100)         | NO   |     | NULL    |                |
    | Age    | tinyint(3) unsigned  | NO   |     | NULL    |                |
    | Gender | enum('F','M')        | YES  |     | NULL    |                |
    +--------+----------------------+------+-----+---------+----------------+
    4 rows in set (0.00 sec)
    MariaDB [hellodb]> insert teachers (name,age) values('leo',33);
    Query OK, 1 row affected (0.00 sec)
    
    MariaDB [hellodb]> select *from teachers;
    +-----+---------------+-----+--------+
    | TID | Name          | Age | Gender |
    +-----+---------------+-----+--------+
    |   1 | Song Jiang    |  45 | M      |
    |   2 | Zhang Sanfeng |  94 | M      |
    |   3 | Miejue Shitai |  77 | F      |
    |   4 | Lin Chaoying  |  93 | F      |
    |   5 | leo           |  33 | NULL   |
    +-----+---------------+-----+--------+
    5 rows in set (0.00 sec)
    
    在主服务器上同样的插入一条记录:
    MariaDB [hellodb]> insert teachers (name,age) values('ivan',38);
    Query OK, 1 row affected (0.00 sec)
    
    在从服务器上可以看到1062同步错误:
    MariaDB [hellodb]> show slave status\G
    ...
    						 Slave_IO_Running: Yes
    						Slave_SQL_Running: No
    									 Last_Errno: 1062
    									 Last_Error: Error 'Duplicate entry '5' for key 'PRIMARY'' on query. Default database: 'hellodb'. Query: 'insert teachers (name,age) values('ivan',38)'
    ...
    
  2. 以上问题我们可以使用忽略同步的错误,继续同步,然后再找出错误的原因 在从服务器上:
    MariaDB [hellodb]> stop slave;                           
    Query OK, 0 rows affected (0.00 sec)
    
    MariaDB [hellodb]> set global sql_slave_skip_counter = 1; #数字代表忽略几个错误
    Query OK, 0 rows affected (0.00 sec)
    
    MariaDB [hellodb]> show slave status\G
    *************************** 1. row ***************************
    							 Slave_IO_State: Waiting for master to send event
    									Master_Host: 192.168.205.17
    									Master_User: repluser
    									Master_Port: 3306
    								Connect_Retry: 60
    							Master_Log_File: bin.000005
    					Read_Master_Log_Pos: 881
    							 Relay_Log_File: mariadb-relay-bin.000005
    								Relay_Log_Pos: 523
    				Relay_Master_Log_File: bin.000005
    						 Slave_IO_Running: Yes
    						Slave_SQL_Running: Yes
    ...
    
  3. 或者把错误编号添加到配置文件中:
[root@slave ~]#vi /etc/my.cnf
[mysqld]
skip_slave_errors = 1062,1077 #可以使用错误编号,多个用逗号,分开,
[root@slave ~]#systemctl restart mariadb

多个从节点的复制,并模拟主节点down机

  1. 我们再加一下从节点slave2: 192.168.205.37
[root@slave2 ~]#yum install mariadb-server
[root@slave2 ~]#vi /etc/my.cnf
[mysqld]
server-id=37
datadir=/data/mysql
log-bin=/data/logs/bin
read-only
[root@slave2 ~]#mkdir /data/{mysql,logs}
[root@slave2 ~]#chown mysql:mysql /data/{mysql,logs}
[root@slave2 ~]#systemctl start mariadb
  1. 在主节点上完全备份数据库
[root@Master ~]#mysqldump -A --single-transaction --master-data=1 -F >/data/all2.mysql  
[root@Master ~]#scp /data/all2.mysql 192.168.205.37:/data
  1. 在第二个从节点上修改配置文件,并启动线程
[root@slave2 ~]#vi /data/all2.mysql 
...
CHANGE MASTER TO
MASTER_HOST='192.168.205.17', 
MASTER_USER='repluser', 
MASTER_PASSWORD='centos', 
MASTER_LOG_FILE='bin.000006',
MASTER_LOG_POS=245;
...
MariaDB [zhaoli]> source /data/all2.mysql
MariaDB [zhaoli]> start slave;
Query OK, 0 rows affected (0.00 sec)
MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.205.17
                  Master_User: repluser
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: bin.000006
          Read_Master_Log_Pos: 245
               Relay_Log_File: mariadb-relay-bin.000002
                Relay_Log_Pos: 523
        Relay_Master_Log_File: bin.000006
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
...
  1. 主服务器上执行如下的一个存储过程, 模拟数据正在更新
	use test
	create table test (id int auto_increment primary key,name char(10));

	delimiter $$

	create procedure  proc_test() 
	begin  
	declare i int;
	set i = 1; 
	while i < 100000 
	do  insert into test(name) values (concat('zhao',i)); 
	set i = i +1; 
	end while; 
	end$$

	delimiter ;

	MariaDB [test]> call proc_test;
  1. 模拟主服务器故障(你可以关机呀),查看所有的从服务器,找一个最新的日志服务提为主。 slave服务器:
MariaDB [test]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Reconnecting after a failed master event read
                  Master_Host: 192.168.205.17
                  Master_User: repluser
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: bin.000006   #复制的日志文件
          Read_Master_Log_Pos: 8981180      #复制的日志位置
               Relay_Log_File: mariadb-relay-bin.000009
                Relay_Log_Pos: 8981458
        Relay_Master_Log_File: bin.000006
             Slave_IO_Running: Connecting
            Slave_SQL_Running: Yes

slave2服务器

MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Reconnecting after a failed master event read
                  Master_Host: 192.168.205.17
                  Master_User: repluser
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: bin.000006
          Read_Master_Log_Pos: 8981180
               Relay_Log_File: mariadb-relay-bin.000002
                Relay_Log_Pos: 8981458
        Relay_Master_Log_File: bin.000006
             Slave_IO_Running: Connecting
            Slave_SQL_Running: Yes
  1. 这里我们把slave: 192.168.205.27提升为主节点 先停掉slave线程:
MariaDB [(none)]> stop slave;
Query OK, 0 rows affected (0.00 sec)

看一下目录结构:

[root@slave data]#ll /data/mysql/
total 37516
-rw-rw---- 1 mysql mysql    16384 Aug  5 22:10 aria_log.00000001
-rw-rw---- 1 mysql mysql       52 Aug  5 22:10 aria_log_control
drwx------ 2 mysql mysql     4096 Aug  5 20:10 db1
drwx------ 2 mysql mysql     4096 Aug  5 21:51 db2
drwx------ 2 mysql mysql     4096 Aug  5 21:51 db3
drwx------ 2 mysql mysql     4096 Aug  5 22:34 db4
drwx------ 2 mysql mysql     4096 Aug  5 19:39 hellodb
-rw-rw---- 1 mysql mysql 18874368 Aug  5 22:37 ibdata1
-rw-rw---- 1 mysql mysql  5242880 Aug  5 22:37 ib_logfile0
-rw-rw---- 1 mysql mysql  5242880 Aug  5 22:37 ib_logfile1
-rw-rw---- 1 mysql mysql      333 Aug  5 22:24 mariadb-relay-bin.000008
-rw-rw---- 1 mysql mysql  8981458 Aug  5 22:37 mariadb-relay-bin.000009
-rw-rw---- 1 mysql mysql       54 Aug  5 22:24 mariadb-relay-bin.index
-rw-rw---- 1 mysql mysql       82 Aug  5 22:47 master.info
drwx------ 2 mysql mysql     4096 Aug  5 19:39 mysql
drwx------ 2 mysql mysql     4096 Aug  5 18:25 performance_schema
-rw-rw---- 1 mysql mysql       54 Aug  5 22:47 relay-log.info
drwx------ 2 mysql mysql     4096 Aug  5 22:36 test
drwx------ 2 mysql mysql     4096 Aug  5 19:39 zhaoli

清除slave信息:

MariaDB [(none)]> reset slave;
Query OK, 0 rows affected (0.01 sec)

查看一下目录结构,发现少了master.info, relay-log.info, 并ralaylog重新开始新的:

[root@slave data]#ll /data/mysql/
total 28736
-rw-rw---- 1 mysql mysql    16384 Aug  5 22:10 aria_log.00000001
-rw-rw---- 1 mysql mysql       52 Aug  5 22:10 aria_log_control
drwx------ 2 mysql mysql     4096 Aug  5 20:10 db1
drwx------ 2 mysql mysql     4096 Aug  5 21:51 db2
drwx------ 2 mysql mysql     4096 Aug  5 21:51 db3
drwx------ 2 mysql mysql     4096 Aug  5 22:34 db4
drwx------ 2 mysql mysql     4096 Aug  5 19:39 hellodb
-rw-rw---- 1 mysql mysql 18874368 Aug  5 22:37 ibdata1
-rw-rw---- 1 mysql mysql  5242880 Aug  5 22:37 ib_logfile0
-rw-rw---- 1 mysql mysql  5242880 Aug  5 22:37 ib_logfile1
-rw-rw---- 1 mysql mysql      264 Aug  5 22:47 mariadb-relay-bin.000001
-rw-rw---- 1 mysql mysql       27 Aug  5 22:47 mariadb-relay-bin.index
drwx------ 2 mysql mysql     4096 Aug  5 19:39 mysql
drwx------ 2 mysql mysql     4096 Aug  5 18:25 performance_schema
drwx------ 2 mysql mysql     4096 Aug  5 22:36 test
drwx------ 2 mysql mysql     4096 Aug  5 19:39 zhaoli

全部清除包括同步的信息,所有同步信息都丢失:

MariaDB [(none)]> reset slave all;
Query OK, 0 rows affected (0.00 sec)
MariaDB [(none)]> show slave status\G
Empty set (0.00 sec)
  1. 修改slave:192.168.205.27配置文件符合为主的配置,不能为只读,且必须启用二进制日志
[root@slave data]#vi /etc/my.cnf
[mysqld]
datadir=/data/mysql
log-bin=/data/logs/bin
#read-only
server-id=27  
[root@slave data]#systemctl restart mariadb 
  1. slve: 192.168.205.27,此时的帐号由于之前备份前创建,所以帐号不需要重建,否则你需要重新建立帐号:
MariaDB [mysql]> select user,host,password from user;
+----------+---------------------+-------------------------------------------+
| user     | host                | password                                  |
+----------+---------------------+-------------------------------------------+
| repluser | 192.168.205.%       | *128977E278358FF80A246B5046F51043A2B1FCED |
+----------+---------------------+-------------------------------------------+
7 rows in set (0.00 sec)

确定从那个地方进行复制:

MariaDB [mysql]> show master logs;
+------------+-----------+
| Log_name   | File_size |
+------------+-----------+
| bin.000001 |     30373 |
| bin.000002 |   1038814 |
| bin.000003 |       264 |
| bin.000004 |    522418 |
| bin.000005 |       264 |
| bin.000006 |       245 |
+------------+-----------+
6 rows in set (0.00 sec)
  1. 修改slave2: 192.168.205.37 的同步信息,修改前一定要先停止slave
	MariaDB [(none)]> stop slave;
	Query OK, 0 rows affected (0.00 sec)

	MariaDB [(none)]> reset slave all;
	Query OK, 0 rows affected (0.00 sec)

	MariaDB [(none)]> CHANGE MASTER TO 
			-> MASTER_HOST='192.168.205.27', #此处为slave IP 192.168.205.27
			-> MASTER_USER='repluser', 
			-> MASTER_PASSWORD='centos', 
			-> MASTER_LOG_FILE='bin.000006', #日志同上show master logs相同
			-> MASTER_LOG_POS=245;   #起始位置与show master logs相同
	Query OK, 0 rows affected (0.01 sec)
	MariaDB [(none)]> start slave;
	Query OK, 0 rows affected (0.00 sec)

	MariaDB [(none)]> show slave status\G
	*************************** 1. row ***************************
								 Slave_IO_State: Waiting for master to send event
										Master_Host: 192.168.205.27
										Master_User: repluser
										Master_Port: 3306
									Connect_Retry: 60
								Master_Log_File: bin.000006
						Read_Master_Log_Pos: 245
								 Relay_Log_File: mariadb-relay-bin.000002
									Relay_Log_Pos: 523
					Relay_Master_Log_File: bin.000006
							 Slave_IO_Running: Yes
							Slave_SQL_Running: Yes
...
  1. 在新主服务器上slave:192.168.205.27删除一个库文件,测试同步情况
	MariaDB [mysql]> show databases;
	+--------------------+
	| Database           |
	+--------------------+
	| information_schema |
	| db1                |
	| db2                |
	| db3                |
	| db4                |
	| hellodb            |
	| mysql              |
	| performance_schema |
	| test               |
	| zhaoli             |
	+--------------------+
	10 rows in set (0.00 sec)

	MariaDB [mysql]> drop database db4;
	Query OK, 0 rows affected (0.00 sec)
  1. 在新从服务器slave2:192.168.205.37上查看,db4已经被删
MariaDB [(none)]>   show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| db1                |
| db2                |
| db3                |
| hellodb            |
| mysql              |
| performance_schema |
| test               |
| zhaoli             |
+--------------------+
9 rows in set (0.00 sec)