MySQL Master-Slave Replication and Clustering

The goal is to master the principles and implementation of master-slave replication, and to build a MySQL master-slave cluster.

Drawbacks of single-node MySQL: it is a single point of failure — if the node goes down, the whole system is unavailable and queries fail.

Benefits of scaling to a MySQL cluster: the same data is kept on two DB servers, and master-slave replication, based on the binlog, keeps it consistent.

A proxy can be added in front of the read/write-splitting / replication architecture, exposing a single IP/domain to clients while hiding the internal topology.

1. Eliminates the single point of failure

2. Improves read/write performance and raises the throughput ceiling

3. Enables read/write splitting: writes (the update server) and reads (the query servers) are deployed separately, making better use of the query replicas--->the open-source software Mycat can serve as the proxy
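The proxy's routing decision can be sketched in a few lines of shell (illustration only — the IPs stand in for the read and write nodes, and a real proxy such as Mycat parses SQL far more carefully):

```shell
# Toy routing rule: SELECT statements go to the read node,
# everything else goes to the write node.
route() {
  case "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')" in
    SELECT*) echo "10.0.0.128" ;;  # read node (hypothetical IP)
    *)       echo "10.0.0.132" ;;  # write node (hypothetical IP)
  esac
}
route "select * from stu"                # -> 10.0.0.128
route "insert into stu values (1,'a')"   # -> 10.0.0.132
```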


MySQL master-slave replication principle: asynchronous replication


Replication mode: asynchronous. Data can be lost between master and slave, because asynchronous replication does not guarantee real-time synchronization.

When asynchronous replication has to sync a large volume of data — for example a stored procedure inserting millions of rows at once — replication lag can become very large, possibly over a minute.

[hellodb]>show slave status \G;

Seconds_Behind_Master: 21
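That lag figure can be pulled out of `show slave status\G` output with awk; a minimal sketch (the sample line is inlined here rather than querying a live server, and the 10s threshold is an arbitrary example):

```shell
# Extract Seconds_Behind_Master from (sample) slave status output
# and flag it when it crosses a threshold.
status='Seconds_Behind_Master: 21'
lag=$(printf '%s\n' "$status" | awk -F': *' '/Seconds_Behind_Master/{print $2}')
threshold=10
if [ "$lag" -gt "$threshold" ]; then
  echo "replication lag ${lag}s exceeds ${threshold}s"
fi
```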

1. Enable the binlog

On the master, enable binary logging (binlog); there are two binlog switches, and ideally both master and slave enable them.

On a write, new data first goes to the transaction log (redo log), then to the binlog.

show variables like "%log_bin%";
show variables like "%sql_log_bin%";

set sql_log_bin=1; -- session-scoped; in 8.0 sql_log_bin cannot be SET GLOBAL
vim /etc/my.cnf
log_bin=/data/mysql/binlog
systemctl restart mysqld

flush logs; --rotate the log file, starting a new binlog

2. The dump thread and replication threads

Master: binlog dump (service) thread --- slave: IO thread --- slave: SQL thread

Sync mechanism: asynchronous replication — a write is issued, applied, and acknowledged to the client (whether a local socket or a remote tool) immediately. Asynchronous replication is usually preferred for its efficiency, since databases generally serve concurrent reads and writes.

Synchronous replication: the write only returns OK after the data has been replicated to the slave.

The binlog is synced to the slave along with its position (pos), keeping positions consistent.

Once received, binlog events are written to the relay log (relay_log), and the SQL thread applies them to the slave's database to complete the update. The relay log is simply where the master's binlog lands after being pulled over; it holds the master's data changes.

--create a user for replication, allowed to access all databases

--export the binlog starting from a chosen pos
mysqlbinlog --start-position=524 /data/mysql/binlog/rocky-bin.000003 > /backup/binlog.sql

Replication architectures


1. The usual first choice: one master--->one slave; well balanced

2. One master, two slaves: more redundancy, but replication lag remains

3. Two masters, each the other's standby: double-write conflicts are possible, so pair it with a read/write-splitting proxy — master1+2 both serve reads/writes, and if one fails the other is promoted immediately to keep serving

On the proxy, just point read and write operations at the right node IPs

Version considerations when building master-slave: versions should ideally match; if not, the slave should run the higher version, because binlog formats differ between versions and a higher version can read a lower version's binlog.

slave: higher version (5.7-8.0); ideally reinstall, then import a full backup

master: lower version (5.7-8.0)


Setting up a MySQL master-slave environment

Environment: master 10.0.0.132. Basic steps. Note: the replication relationship survives a service restart and stays in sync; after a slave is powered back on it catches up right away, though with large data volumes there is some lag.

slave: 10.0.0.128

version: MySQL 8.0.30

1. Master
1) Enable the binary log
2) server_id=
3) Create a user with the replication slave privilege
4) Full backup with --master-data=1, or --master-data=2 (binlog pos written as a comment)
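With --master-data=2 the coordinates land in the dump as a commented CHANGE MASTER TO line; a sketch of pulling the file and pos back out (the sample line is inlined — a real dump has it near the top):

```shell
# Recover the binlog file/pos recorded by mysqldump --master-data=2.
line="-- CHANGE MASTER TO MASTER_LOG_FILE='rocky-bin.000030', MASTER_LOG_POS=157;"
file=$(printf '%s\n' "$line" | sed -n "s/.*MASTER_LOG_FILE='\([^']*\)'.*/\1/p")
pos=$(printf '%s\n' "$line" | sed -n 's/.*MASTER_LOG_POS=\([0-9]*\).*/\1/p')
echo "$file $pos"
```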

2. Slave
1) server_id=
2) source /data/full.sql  -- import the backup
3) CHANGE MASTER TO      -- set the master's details
  MASTER_HOST='10.0.0.8',
  MASTER_USER='repluser',
  MASTER_PASSWORD='123456',
  MASTER_PORT=3306,
MASTER_LOG_FILE='mysql-bin.000002', MASTER_LOG_POS=30489887,
MASTER_CONNECT_RETRY=5;
4) start slave;

stop slave;
reset slave all; ##reset the slave
show master status: replication resumes from the pos recorded after the DML ran
start slave;

3. Master-master architecture

1. Enable the binlog

On versions below 8.0, the binlog must be enabled manually

##8.0 enables the binlog by default
show variables like "%log_bin%";
show variables like "%sql_log_bin%";

set sql_log_bin=1; -- session-scoped; in 8.0 sql_log_bin cannot be SET GLOBAL

vim /etc/my.cnf
log_bin=/data/mysql/binlog/master-bin --->prefix for the binlog file names
systemctl restart mysqld

flush logs; --rotate the log file, starting a new binlog

2. Configure basic master/slave node settings

##Master node: change the hostname
hostnamectl set-hostname master

##Set server-id in the mysqld config file; it can be 1, but every node's server-id must be unique, otherwise the nodes cannot be told apart
A good convention is the last octet of the IP address
vim /etc/my.cnf
[mysqld]
server-id=132

##Check the master's binary log position. No database filter is needed; all newly written data replicates from this binlog pos onward (--start-position=157)
[(none)]>show master status;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| rocky-bin.000019 |      157 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+

3. Create and authorize the replication account on the master

revoke: withdraw privileges

##Show user catyer's grants
show grants for 'catyer'@'10.0.0.%';
+-----------------------------------------------------------+
| Grants for catyer@10.0.0.%                                |
+-----------------------------------------------------------+
| GRANT USAGE ON *.* TO `catyer`@`10.0.0.%`                 |
| GRANT ALL PRIVILEGES ON `zabbix`.* TO `catyer`@`10.0.0.%` |
+-----------------------------------------------------------+

##Revoke the delete privilege
revoke delete on hellodb.* from catyer@'10.0.0.%';
flush privileges;

##Revoke all privileges
revoke all on hellodb.* from catyer@'10.0.0.%';
flush privileges;

Versions differ here, but the grant must always be executed, otherwise the master cannot spawn a dump thread for the slave

##Around 5.7 and earlier: grant + create the account in one statement
grant replication slave on *.* to sync@'%' identified by '123';

##MySQL 8.0: create the user first, then grant
create user xxx@'%' identified by '123';
show grants for sync@'%';

##Revoke, flush, then grant the replication privilege
revoke all on *.* from sync@'%';

grant replication slave on *.* to sync@'%';
flush privileges;

[(none)]>show grants for sync@'%';
+----------------------------------------------+
| Grants for sync@%                            |
+----------------------------------------------+
| GRANT REPLICATION SLAVE ON *.* TO `sync`@`%` |
+----------------------------------------------+
1 row in set (0.01 sec)

This attempt failed, proving the sync user lacked the replication privilege


4. Take a full backup on the master and import it on the slave, so the initial data is identical

Consider dumping every database to its own .sql file, then importing them in a batch — as a script, a transaction, or a MySQL stored procedure; a script is usually the cleanest.

The slave may be a brand-new instance whose initial data differs; replicating directly would give it only new rows and none of the old data.
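The batch-import idea can be sketched as a loop over the per-database dump files (the mysql invocation is commented out so the loop runs standalone; the file names here are dummies):

```shell
# Loop over dump files and import each; count what was processed.
DIR=$(mktemp -d)
touch "$DIR/db1.sql" "$DIR/db2.sql"   # stand-ins for real dumps
count=0
for f in "$DIR"/*.sql; do
  # mysql -uroot -p"$PASS" < "$f"     # real import (needs a live server)
  count=$((count + 1))
done
echo "imported $count files"
rm -r "$DIR"
```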

Single-database backup

#!/bin/bash
TIME=`date +%Y-%m-%d_%H-%M-%S`
DIR=/data/backup
DB="hellodb"
PASS=123
HOST=10.0.0.128

##-B selects the named database(s)
[ -d $DIR ] && echo "dir exist" || mkdir -p $DIR
mysqldump -uroot -p$PASS --single-transaction --source-data=2 -B $DB -F | gzip > ${DIR}/${DB}.${TIME}.sql.gz
#mysqldump -usync -p$PASS -h$HOST --single-transaction --source-data=2 -A | gzip > /data/backup/backup.sql.gz

scp ${DIR}/${DB}.${TIME}.sql.gz $HOST:/data/backup

Multi-database backup: excludes system databases + uses expect for non-interactive ssh

#!/bin/bash
TIME=`date +%F_%T`
DIR=/data/backup
DB="hellodb"
PASS=123

##hosts to copy to, in a batch
HOST=(10.0.0.128
10.0.0.129)


[ -d $DIR ] && echo "dir exist" || mkdir -p $DIR
mysqldump -uroot -p$PASS --single-transaction --source-data=2 -A -F > ${DIR}/${DB}_${TIME}_all.sql
#mysqldump -usync -p$PASS -h$HOST --single-transaction --source-data=2 -A | gzip > /data/backup/backup.sql.gz

for IP in ${HOST[*]};do
        #non-interactively copy the ssh key to the target host
        expect <<EOF
        spawn ssh-copy-id root@$IP
        expect {
                "yes/no" {send "yes\n";exp_continue}
                "password" {send "123\n"}
        }
        expect eof
EOF
        scp ${DIR}/${DB}_${TIME}_all.sql $IP:/data/backup
        echo "$IP copy success"
done

Full backup with -A: just source it on the slave, no database selection needed

##-A full backup, -F flush/rotate the logs
#!/bin/bash
TIME=`date +%Y-%m-%d_%H-%M-%S`
DIR=/data/backup
DB="hellodb"
PASS=123
HOST=10.0.0.128

[ -d $DIR ] && echo "dir exist" || mkdir -p $DIR
mysqldump -uroot -p$PASS --single-transaction --source-data=2 -A -F > ${DIR}/${DB}_${TIME}_all.sql
#mysqldump -usync -p$PASS -h$HOST --single-transaction --source-data=2 -A | gzip > /data/backup/backup.sql.gz

scp ${DIR}/${DB}_${TIME}_all.sql $HOST:/data/backup

[zabbix]>source /data/backup/hellodb_2022-09-11_09-15-11_all.sql

Check the master's dump thread among the running threads

show processlist;
[(none)]>show processlist;
+----+-----------------+-----------+------+---------+------+------------------------+------------------+
| Id | User            | Host      | db   | Command | Time | State                  | Info             |
+----+-----------------+-----------+------+---------+------+------------------------+------------------+
|  5 | event_scheduler | localhost | NULL | Daemon  | 7920 | Waiting on empty queue | NULL             |
| 52 | root            | localhost | NULL | Query   |    0 | init                   | show processlist |
+----+-----------------+-----------+------+---------+------+------------------------+------------------+

Single-table backup: dump one table from a database

Multiple tables can also be dumped

##Suppose the slave's table was accidentally truncated
truncate table stu;
[hellodb]>select * from stu;
Empty set (0.00 sec)

##Dump the single table, then restore it
mysqldump -uroot -p123 hellodb stu > /data/stu.sql
scp /data/stu.sql 10.0.0.128:/data/

source /data/stu.sql
[hellodb]>select * from stu where stuid=25;
+-------+-------------+-----+--------+
| StuID | Name        | Age | Gender |
+-------+-------------+-----+--------+
|    25 | Sun Dasheng | 100 | M      |
+-------+-------------+-----+--------+

##Back up several tables
mysqldump -uroot -p123 hellodb stu teachers job > /data/stu.sql

5. Slave config file and importing the backup; replication is thread-based

You can import the dump first, then record the binlog file name and pos via show master status, then run CHANGE MASTER TO

vim /etc/my.cnf
read-only ##read-only: stops ordinary users from changing data, but does not restrict root

CHANGE MASTER TO
  MASTER_HOST='10.0.0.132',
  MASTER_USER='sync',
  MASTER_PASSWORD='123',
  MASTER_PORT=3306,
  MASTER_LOG_FILE='mysql-bin.000002', MASTER_LOG_POS=30489887,
  MASTER_CONNECT_RETRY=5;

##Uncomment, copy it over; on the slave, add the binlog coordinates
CHANGE MASTER TO
MASTER_HOST='10.0.0.132',
MASTER_USER='sync',
MASTER_PASSWORD='123',
MASTER_PORT=3306,
MASTER_LOG_FILE='rocky-bin.000783', MASTER_LOG_POS=197;


Disable binary logging, import the dump, then check the slave's status: the relay log has been generated (relay_log, where the master's binlog events land). Slave_IO is still not running, so start the replication threads.

set sql_log_bin=0;
source /data/backup/hellodb_2022-09-11_10-48-26_all.sql

show slave status\G;
[zabbix]>show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: 
                  Master_Host: 10.0.0.132
                  Master_User: sync
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: rocky-bin.000021
          Read_Master_Log_Pos: 157
               Relay_Log_File: master-relay-bin.000001
                Relay_Log_Pos: 4
        Relay_Master_Log_File: rocky-bin.000021
             Slave_IO_Running: No
            Slave_SQL_Running: No
            
start slave;
show processlist;
IO thread + SQL thread: the IO thread receives replication events from the master; the SQL thread (a query thread showing "waiting for updates") writes them into the slave's data
Master side: the binlog dump thread



6. Test the master-slave setup

Write new data on the master and check whether the slave has it; a stored procedure works too

Testing with a stored procedure broke the slave's SQL thread outright

[hellodb]>select * from teachers;
+-----+---------------+-----+--------+
| TID | Name          | Age | Gender |
+-----+---------------+-----+--------+
|   1 | Song Jiang    |  45 | M      |
|   2 | Zhang Sanfeng |  94 | M      |
|   3 | Miejue Shitai |  77 | F      |
|   4 | Lin Chaoying  |  93 | F      |
|   5 | catyer        |  16 | M      |
|   6 | catyer2       |  17 | M      |
+-----+---------------+-----+--------+

##Write on the master: delete a row and it disappears on the slave too
[hellodb]>delete from teachers where tid=6;
Query OK, 1 row affected (0.00 sec)

[hellodb]>select * from teachers;
+-----+---------------+-----+--------+
| TID | Name          | Age | Gender |
+-----+---------------+-----+--------+
|   1 | Song Jiang    |  45 | M      |
|   2 | Zhang Sanfeng |  94 | M      |
|   3 | Miejue Shitai |  77 | F      |
|   4 | Lin Chaoying  |  93 | F      |
|   5 | catyer        |  16 | M      |
+-----+---------------+-----+--------+
5 rows in set (0.00 sec)

##Test with the stored procedure
insert into teachers(name,age,gender) values('a',22,'M');
select * from teachers;

Fixing Slave_IO_Running: No — often the replication user lacks privileges

##The replication account had no privileges; check on the slave
[(none)]>show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for source to send event
                  Master_Host: 10.0.0.132
                  Master_User: sync
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: rocky-bin.000022
          Read_Master_Log_Pos: 60979553
               Relay_Log_File: master-relay-bin.000002
                Relay_Log_Pos: 326
        Relay_Master_Log_File: rocky-bin.000022
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

##Grant the privilege
grant replication slave on *.* to sync@'%';
flush privileges;

[(none)]>show grants for sync@'%';
+----------------------------------------------+
| Grants for sync@%                            |
+----------------------------------------------+
| GRANT REPLICATION SLAVE ON *.* TO `sync`@`%` |
+----------------------------------------------+
1 row in set (0.01 sec)

Fixing Slave_SQL_Running: No

Re-run CHANGE MASTER TO on the slave: record the master's binlog pos, adjust, and execute

Check the logs — the error log (mysql-error.log); its location can be customized

##Stop the slave
stop slave;

##Check the binlog pos on the master
show master status;
[hellodb]>show master status;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| rocky-bin.000022 | 60979553 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+

##On the slave, set the pos by hand and execute
CHANGE MASTER TO
MASTER_HOST='10.0.0.132',
MASTER_USER='sync',
MASTER_PASSWORD='123',
MASTER_PORT=3306,
MASTER_LOG_FILE='rocky-bin.000784', MASTER_LOG_POS=197;

##Start the slave
start slave;

##It may also be a schema mismatch; the table's column definitions need fixing
2022-09-11T03:39:17.143632Z 28 [Warning] [MY-010584] [Repl] Slave SQL for channel '': ... The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state. A restart should restore consistency automatically, although using non-transactional storage for data or info tables or DDL queries could lead to problems. In such cases you have to examine your data (see documentation for details). Error_code: MY-001756

desc testlog;
show create table testlog;

Fixing SQL_Running=No after running a stored procedure, diagnosed via the error reported by show slave status

Troubleshooting relies mainly on the error log; check the error log or the performance_schema.replication_applier_status_by_worker table to find the cause

Objects being replicated must exist on the master but not on the slave; even an empty table on the slave causes a conflict

##First the symptom: SQL_RUNNING is No, and the error points at the logs
[hellodb]>show slave status \G;

Seconds_Behind_Master: 21
Slave_SQL_Running: No
Last_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction 'ANONYMOUS' at master log rocky-bin.000023, end_log_pos 444. 
See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.

##The error log suggests a table issue; check the next table as well
tail -f mysql.log
2022-09-11T03:34:59.334704Z 23 [ERROR] [MY-013146] [Repl] Slave SQL for channel '': Worker 1 failed executing transaction 'ANONYMOUS' at master log rocky-bin.000022, end_log_pos 60979827; Column 1 of table 'hellodb.testlog' cannot be converted from type 'char(30(bytes))' to type 'char(40(bytes) utf8mb4)', Error_code: MY-013146

##Found it: the testlog table already exists on the slave, so the replicated statement cannot run
[hellodb]>select * from performance_schema.replication_applier_status_by_worker\G;

LAST_ERROR_MESSAGE: Worker 1 failed executing transaction 'ANONYMOUS' at master log rocky-bin.000023, end_log_pos 444; 
Error 'Table 'testlog' already exists' on query. Default database: 'hellodb'. Query: 'create table testlog (id int auto_increment primary key,name char(30),salary int default 20) character set utf8mb4'

##Drop the table, stop the slave, update the master info, then start the slave
drop table testlog;
stop slave;

##On the slave, set the pos by hand and execute
CHANGE MASTER TO
MASTER_HOST='10.0.0.132',
MASTER_USER='sync',
MASTER_PASSWORD='123',
MASTER_PORT=3306,
MASTER_LOG_FILE='rocky-bin.000024', MASTER_LOG_POS=157;

start slave;

##Run the stored procedure again; first add something on the source — create the table
use hellodb;
create table testlog (id int auto_increment primary key,name char(30),salary int default 20) character set utf8mb4;

delimiter $$

create procedure sp_testlog() 
begin  
declare i int;
set i = 1; 
while i <= 100000 
do  insert into testlog(name,salary) values (concat('wang',FLOOR(RAND() * 100000)),FLOOR(RAND() * 1000000)); 
set i = i +1; 
end while; 
end$$

delimiter ;

source /root/testlog.sql
call sp_testlog();

##The slave shows 21s of replication lag — already much improved
show slave status \G;
[hellodb]>select count(*) from testlog;
+----------+
| count(*) |
+----------+
|    36626 |
+----------+

Seconds_Behind_Master: 21

##The stored procedure's rows have fully replicated
[hellodb]>select count(*) from testlog;
+----------+
| count(*) |
+----------+
|   100000 |
+----------+

show processlist;
49 | sync            | 10.0.0.128:60752 | NULL    | Binlog Dump |  1475 | Source has sent all binlog to replica; waiting for more updates | NULL
wait for events

ss -nt: shows established TCP connections (ESTABLISHED); the session from 10.0.0.128 on the service port is established



7. Replication blocked: key conflicts, skipping errors

Skip specific events / skip a number of errors

show variables like '%sql_slave_skip_counter%';

##Skip N errors
set global sql_slave_skip_counter=N;

Sometimes the slave is not set read-only and a row is inserted there, say id=9; the master then also inserts a row with id=9. The two id 9 rows are different records, so the duplicate key conflicts and replication blocks (duplicate entry).

show slave status \G;
Slave_IO_Running: Yes
Slave_SQL_Running: Yes

##Insert a row on the slave: record 8
insert into teachers(name,age,gender) values('tea',11,'M'); 
[hellodb]>select * from teachers where tid=8;
+-----+------+-----+--------+
| TID | Name | Age | Gender |
+-----+------+-----+--------+
|   8 | tea  |  11 | M      |
+-----+------+-----+--------+

##Insert a row on the master
insert into teachers(name,age,gender) values('teach',20,'M'); 

##The two tid=8 rows now differ; check the error
[hellodb]>select * from teachers;
+-----+---------------+-----+--------+
| TID | Name          | Age | Gender |
+-----+---------------+-----+--------+
|   1 | Song Jiang    |  45 | M      |
|   2 | Zhang Sanfeng |  94 | M      |
|   3 | Miejue Shitai |  77 | F      |
|   4 | Lin Chaoying  |  93 | F      |
|   5 | catyer        |  16 | M      |
|   7 | teach         |  20 | M      |
|   8 | teach         |  22 | M      |
+-----+---------------+-----+--------+
7 rows in set (0.00 sec)

[hellodb]>select * from teachers;
+-----+---------------+-----+--------+
| TID | Name          | Age | Gender |
+-----+---------------+-----+--------+
|   1 | Song Jiang    |  45 | M      |
|   2 | Zhang Sanfeng |  94 | M      |
|   3 | Miejue Shitai |  77 | F      |
|   4 | Lin Chaoying  |  93 | F      |
|   5 | catyer        |  16 | M      |
|   7 | uuu           |  11 | M      |
|   8 | tea           |  11 | M      |
+-----+---------------+-----+--------+
7 rows in set (0.00 sec)

##The error shows entry 7 is where replication blocked: the seventh row conflicts, so nothing after it replicates
##In fact both 7 and 8 conflict, so two errors must be skipped
show slave status \G;
[hellodb]>select * from performance_schema.replication_applier_status_by_worker\G;

LAST_ERROR_MESSAGE: Worker 1 failed executing transaction 'ANONYMOUS' at master log rocky-bin.000024, end_log_pos 30689864; Could not execute Write_rows event on table hellodb.teachers; 
Duplicate entry '7' for key 'teachers.PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log rocky-bin.000024, end_log_pos 30689864

Temporary workaround: skip the errors

stop slave;
##Skip 2 errors
set global sql_slave_skip_counter=2;
select @@sql_slave_skip_counter;

start slave;
show slave status \G;
Slave_IO_Running: Yes
Slave_SQL_Running: Yes

##Insert on the master: it shows up
[hellodb]>insert into teachers(name,age,gender) values('teach9',20,'M');
[hellodb]>select * from teachers where tid=9;
+-----+--------+-----+--------+
| TID | Name   | Age | Gender |
+-----+--------+-----+--------+
|   9 | teach9 |  20 | M      |
+-----+--------+-----+--------+

##Proper fix: edit by hand; with multiple conflicting rows, fix each one
update xxx set where

Dual-master architecture: each node is the other's master

Run CHANGE MASTER TO on the master as well, making it a slave of its own slave. This invites data conflicts and is not recommended: with bidirectional sync, users may unwittingly write to two different masters instead of one dataset being written once.

Writing a single copy of the data is safest — no conflicts.

Environment: master 10.0.0.132, slave 10.0.0.128; now 132 is also a slave of 128

1. On the slave, check the MySQL binlog and record its pos
show master status;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| slave-bin.000004 |      157 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+

2. Execute CHANGE MASTER TO

CHANGE MASTER TO
MASTER_HOST='10.0.0.128',
MASTER_USER='sync',
MASTER_PASSWORD='123',
MASTER_PORT=3306,
MASTER_LOG_FILE='slave-bin.000004', MASTER_LOG_POS=157;

##Start the slave
start slave;
##Reset
reset slave all;

Semi-synchronous replication (important): a commit returns success once one slave has synced, or once the timeout expires — speeding up transaction returns--->used in master-slave (active/standby) architectures


Advantage of asynchronous replication: fast responses — success is returned as soon as the write lands on the master

Drawback: if the master's dump thread dies during sync, the changes never reach the slave and its data goes stale; the slave's SQL thread must also finish flushing to disk, and until it does the slave is likewise behind

Semi-sync adds a plugin that checks whether the slave has persisted the data: the master returns success only after the slave's transaction commits and hits disk, or after the replication timeout expires

Basic flow (enhanced semi-sync): by waiting for sync first, it fixes the case where the master reports success but a read against the (possibly read-only) slave finds no data

1. The user submits a transaction (DML) and the SQL runs; the client may be a local socket or a remote user

2. Write the binary log

3. Wait for the dump thread to replicate to the slave

4. Commit the transaction, flushing to disk

5. Return success to the client


Building a one-master-two-slave test setup: rpl_semi_sync_master_enabled

Master: 132; slave 1: 128; slave 2: 131

Plugin: rpl_semi_sync_master_enabled; inspect with show plugins

1. Sync the master/slave data and start the slaves

##Configuration on the master node
show variables like '%rpl_semi_sync_master_enabled%';

vim /etc/my.cnf
rpl_semi_sync_master_enabled=on
rpl_semi_sync_master_timeout=

##List the built-in plugins, then install the new one
1. binlog, password/authentication plugins, daemons, storage engines, etc.
[(none)]>show plugins;
+---------------------------------+----------+--------------------+---------+---------+
| Name                            | Status   | Type               | Library | License |
+---------------------------------+----------+--------------------+---------+---------+
| binlog                          | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| mysql_native_password           | ACTIVE   | AUTHENTICATION     | NULL    | GPL     |
| sha256_password                 | ACTIVE   | AUTHENTICATION     | NULL    | GPL     |
| caching_sha2_password           | ACTIVE   | AUTHENTICATION     | NULL    | GPL     |
| sha2_cache_cleaner              | ACTIVE   | AUDIT              | NULL    | GPL     |
| daemon_keyring_proxy_plugin     | ACTIVE   | DAEMON             | NULL    | GPL     |
| CSV                             | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |

2. Install the semi-sync plugin rpl_semi_sync_master ##master and slaves each need their plugin installed
install plugin rpl_semi_sync_master soname 'semisync_master.so';
Query OK, 0 rows affected, 1 warning (0.07 sec)

show plugins;
rpl_semi_sync_master            | ACTIVE   | REPLICATION        | semisync_master.so | GPL

3. Edit the config file
vim /etc/my.cnf
rpl_semi_sync_master_enabled=on ##enable semi-sync
rpl_semi_sync_master_timeout=10000 ##timeout in milliseconds; the default is 10s (10000)

[(none)]>select @@rpl_semi_sync_master_timeout;
+--------------------------------+
| @@rpl_semi_sync_master_timeout |
+--------------------------------+
|                          10000 |
+--------------------------------+

systemctl restart mysqld

4. Check the plugin status
[(none)]>select @@rpl_semi_sync_master_enabled;
+--------------------------------+
| @@rpl_semi_sync_master_enabled |
+--------------------------------+
|                              1 |
+--------------------------------+

5. Full backup to the slaves; confirm the binlog pos ##done on master and slave1, scp to the slaves
grep -i "change master" hellodb_2022-09-12_11:45:11_all.sql

show master status;
-- CHANGE MASTER TO MASTER_LOG_FILE='rocky-bin.000030', MASTER_LOG_POS=157;

stop slave;
source /data/backup/hellodb_2022-09-12_11:45:11_all.sql

CHANGE MASTER TO
MASTER_HOST='10.0.0.132',
MASTER_USER='sync',
MASTER_PASSWORD='123',
MASTER_PORT=3306,
MASTER_LOG_FILE='rocky-bin.000030', MASTER_LOG_POS=157;

start slave;
show slave status \G;
##Confirm Slave_IO_Running and Slave_SQL_Running are both on

##If the slave hits sync problems, reset:
stop slave;
reset slave all;
start slave;

2. Install the plugin on the slave nodes

Install the semi-sync plugins on both sides; the timeout is left unconfigured for now, so both default to 10s

##Install and uninstall the plugin on the master
install plugin rpl_semi_sync_master soname 'semisync_master.so';
uninstall plugin rpl_semi_sync_master;

##Install the slave plugin on the slaves
install plugin rpl_semi_sync_slave soname 'semisync_slave.so';

select @@rpl_semi_sync_master_enabled;
+--------------------------------+
| @@rpl_semi_sync_master_enabled |
+--------------------------------+
|                              0 |
+--------------------------------+

1. Set the variables dynamically
set global rpl_semi_sync_master_enabled=ON;
set global rpl_semi_sync_slave_enabled=ON;
show plugins;
[(none)]>select @@rpl_semi_sync_master_enabled;
+--------------------------------+
| @@rpl_semi_sync_master_enabled |
+--------------------------------+
|                              1 |
+--------------------------------+

2. Persist in the master's config file, and in the slave's
vim /etc/my.cnf
[mysqld]
rpl_semi_sync_master_enabled
systemctl restart mysqld

#slave config file
[mysqld]
rpl_semi_sync_slave_enabled
systemctl restart mysqld


##Check on the master; clients=2 means both slaves are connected correctly
[hellodb]>show status like '%semi%';
+--------------------------------------------+-------+
| Variable_name                              | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients               | 2     |
| Rpl_semi_sync_master_net_avg_wait_time     | 0     |
| Rpl_semi_sync_master_net_wait_time         | 0     |
| Rpl_semi_sync_master_net_waits             | 0     |
| Rpl_semi_sync_master_no_times              | 0     |
| Rpl_semi_sync_master_no_tx                 | 0     |
| Rpl_semi_sync_master_status                | ON    |
| Rpl_semi_sync_master_timefunc_failures     | 0     |
| Rpl_semi_sync_master_tx_avg_wait_time      | 0     |
| Rpl_semi_sync_master_tx_wait_time          | 0     |
| Rpl_semi_sync_master_tx_waits              | 0     |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0     |
| Rpl_semi_sync_master_wait_sessions         | 0     |
| Rpl_semi_sync_master_yes_tx                | 0     |
+--------------------------------------------+-------+
14 rows in set (0.00 sec)

3. Test the 10s timeout: if 10s pass with no slave responding, the commit returns success anyway

The timeout is a global variable, in milliseconds

Stop replication on both slaves to see the effect

stop slave;

[hellodb]>select @@rpl_semi_sync_master_timeout;
+--------------------------------+
| @@rpl_semi_sync_master_timeout |
+--------------------------------+
|                          10000 |
+--------------------------------+

##Change the global variable
set global rpl_semi_sync_master_timeout=2000;
Query OK, 0 rows affected (0.00 sec)

##With no slave responding, the commit needs 10s to return; the timeout value is tunable
[hellodb]>insert into teachers(name,age,gender) values('a',22,'M');
Query OK, 1 row affected (10.00 sec)

##With a slave responding, it returns instantly — semi-sync is working
start slave;
[hellodb]>insert into teachers(name,age,gender) values('a',22,'M');
Query OK, 1 row affected (0.02 sec)

##MySQL's default wait point: commit is acknowledged after sync (AFTER_SYNC)
[hellodb]>show variables like 'Rpl_semi_sync_master_wait_point';
+---------------------------------+------------+
| Variable_name                   | Value      |
+---------------------------------+------------+
| rpl_semi_sync_master_wait_point | AFTER_SYNC |
+---------------------------------+------------+

Replication filter: binlog-do-db

Specifies which databases generate binary log events

vim /etc/my.cnf

binlog_do_db=db1 ##allowlist
binlog_ignore_db=db2 ##blocklist
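For reference, the slave side has its own filter options that control which replicated events get applied (a config sketch, reusing the same hypothetical database names):

```ini
[mysqld]
replicate_do_db=db1      # apply only db1's events on the slave
replicate_ignore_db=db2  # skip db2's events
```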


Concurrent, automatic replication without hunting for the binlog pos: GTID (MySQL 5.6 and later)

Benefit: in a replication relationship, neither admins nor users need to track the binlog pos; with GTID on, the transaction's binlog position is found automatically and new changes replicate on their own

Supports multiple dump threads replicating in parallel, greatly reducing lag

Transaction uniqueness: GTID = server_uuid + transaction_id, globally unique

server_uuid comes from /data/mysql/auto.cnf

[root@rocky mysql]#cat auto.cnf 
[auto]
server-uuid=5aa76b1b-1b77-11ed-9219-000c29333d85
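The uniqueness claim is easy to illustrate: a GTID is just the server's UUID joined to a per-server transaction sequence number (the transaction id 42 below is made up):

```shell
# Compose a GTID: server_uuid (from auto.cnf) + transaction id.
server_uuid='5aa76b1b-1b77-11ed-9219-000c29333d85'
txn=42   # hypothetical transaction sequence number
gtid="${server_uuid}:${txn}"
echo "$gtid"   # 5aa76b1b-1b77-11ed-9219-000c29333d85:42
```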


Enable GTID; both master and slave must turn on GTID_MODE

vim /etc/my.cnf
[mysqld]
gtid_mode=on
enforce_gtid_consistency

systemctl restart mysqld

##Check the master's pos
[hellodb]>show master status;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| rocky-bin.000032 |      157 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+

##Slave configuration
stop slave;

CHANGE MASTER TO
MASTER_HOST='10.0.0.132',
MASTER_USER='sync',
MASTER_PASSWORD='123',
MASTER_PORT=3306,
MASTER_AUTO_POSITION=1;
#MASTER_LOG_FILE='slave-bin.000004', MASTER_LOG_POS=157; --->this line is no longer needed

start slave;
##Alternatively, to start over:
reset slave all;

##Already in sync with the master; setting up replication is much easier this way
[(none)]>show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for source to send event
                  Master_Host: 10.0.0.132
                  Master_User: sync
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: rocky-bin.000032
          Read_Master_Log_Pos: 157

Health monitoring of MySQL replication (in bash)

Monitoring software needs a few indicators to check; bash can pull them out

##Monitor replication health: the two threads, on the slave
mysql -uroot -p123 -e 'show slave status\G' | awk -F':' '/Slave_IO_Running/{print $2}'
mysql -uroot -p123 -e 'show slave status\G' | awk -F':' '/Slave_SQL_Running/{print $2}'
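Those two one-liners can be combined into a single health verdict; a sketch with the status output inlined (swap in the real mysql call when running against a live server):

```shell
# Decide replication health from Slave_IO_Running / Slave_SQL_Running.
sample='Slave_IO_Running: Yes
Slave_SQL_Running: Yes'
io=$(printf '%s\n' "$sample" | awk -F': *' '/Slave_IO_Running/{print $2}')
sql=$(printf '%s\n' "$sample" | awk -F': *' '/Slave_SQL_Running/{print $2}')
if [ "$io" = "Yes" ] && [ "$sql" = "Yes" ]; then
  state=healthy
else
  state=broken
fi
echo "$state"   # healthy
```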

show master status; ##the master's binlog
show binlog events; ##inspect the binlog contents
mysqlbinlog rocky-bin.000003 
show slave status \G;
show processlist;

stop slave;
reset slave all;
start slave;

1. Configure GTID so the binlog pos is found automatically
2. Avoid large transactions to cut replication lag
3. Reduce lock contention
4. Establish the master-slave relationship

Causes of master-slave data inconsistency


binlog format (row-based or statement-based), whether the binlog is enabled, whether sync is healthy (show slave status), whether the service is up, version match (the slave should run the higher version), etc.

The master's replication account sync is missing grant replication slave

Regularly monitor the master-slave heartbeat and watch the slave status

---->re-take a full backup, import it, rebuild the replication relationship, etc.

mycat: a database proxy implementation

The proxy server encodes the routing logic: write requests go to the master; read requests go to the slaves (the update server vs the query servers)

Vertical splitting: split by database/table; loosely related tables can be separated, but tables like stu and teachers (frequently joined) are awkward to split apart; database splits should be agreed with developers and follow business domains

Horizontal splitting: spread data across servers to relieve performance bottlenecks, or combine with read/write splitting

--->The proxy decides which node data lands on; the proxy tier itself needs redundant nodes too, to avoid its own single point of failure

Main function: a MySQL proxy that fronts many backend MySQL servers, enabling a large-scale MySQL server cluster

mycat's core protocol: JDBC; it can talk to the major databases — mycat is an abstraction over the database layer

How it works: front-end (HAProxy) load balancing --- a mycat proxy cluster (redundant nodes) --- requests forwarded to 1-N backend MySQL nodes (redundant); much like needing two nginx pods for forwarding in a K8s cluster, so that if one pod dies the other still forwards requests

mycat read/write splitting:

1. Write requests: forwarded to the backend master, the update server

2. Read requests: forwarded to the backend query servers, the read nodes


mycat read/write splitting in practice


Environment: client: ubuntu 10.0.0.131

mycat node: 10.0.0.130

Master: 10.0.0.132

Slave: 10.0.0.128

1. Download the package, extract to a target directory, and configure PATH; the same pattern works for other services, e.g. nginx built from source

This applies to any service installed from source: put its bin path on PATH so you never have to cd into the bin directory — same for tomcat, nginx, and many others

tar xf Mycat-server-1.6.7.4-release-20200105164103-linux.tar.gz -C /mycat

##Package contents: prebuilt, ready to use
bin  catlet  conf  lib  logs  version.txt

##Put the executable on PATH so it runs without the full path
Original path: /apps/mycat/bin/mycat

vim /etc/profile.d/mycat.sh — so the system resolves the mycat command through PATH
PATH=/apps/mycat/bin:$PATH
##Apply the environment change
. /etc/profile.d/mycat.sh    # or: source /etc/profile.d/mycat.sh

[11:01:44 root@proxy bin]#mycat
Usage: /apps/mycat/bin/mycat { console | start | stop | restart | status | dump }

2. Install the Java runtime; change mycat's default ports

Give the mycat node more memory — Java processes are memory-hungry

yum -y install java

##Java 8
java -version
openjdk version "1.8.0_342"
OpenJDK Runtime Environment (build 1.8.0_342-b07)
OpenJDK 64-Bit Server VM (build 25.342-b07, mixed mode)

Start mycat: it listens on two ports, 8066 (service) and 9066 (management), both exposed by default and both configurable

We change the client-facing port to MySQL's standard 3306 below

mycat start
ss -ntl | grep 8066
LISTEN     0      100         :::8066                    :::*

##ss -ntlp: shows which process owns the port (TCP listeners only); the users: column names the process
netstat -ntlp
[11:15:41 root@proxy ~]#ss -ntlp | grep 8066
LISTEN     0      100         :::8066                    :::*                   users:(("java",pid=11766,fd=86))
[11:15:51 root@proxy ~]#ss -ntlp | grep 9066
LISTEN     0      100         :::9066                    :::*                   users:(("java",pid=11766,fd=82))

##Check mycat's log: logs/wrapper.log
[11:18:26 root@proxy logs]#tail -f wrapper.log 
STATUS | wrapper  | 2022/09/17 11:14:07 | --> Wrapper Started as Daemon
STATUS | wrapper  | 2022/09/17 11:14:07 | Launching a JVM...
INFO   | jvm 1    | 2022/09/17 11:14:08 | Wrapper (Version 3.2.3) http://wrapper.tanukisoftware.org
INFO   | jvm 1    | 2022/09/17 11:14:08 |   Copyright 1999-2006 Tanuki Software, Inc.  All Rights Reserved.
INFO   | jvm 1    | 2022/09/17 11:14:08 | 
INFO   | jvm 1    | 2022/09/17 11:14:09 | MyCAT Server startup successfully. see logs in logs/mycat.log

Change the mycat ports: this block is commented out by default; copy in an uncommented version and set the client-facing port. Client port: 3306; management port: 9066

mycat's default user: root, default password: 123456 — change it to 123

/serverport,/user

/conf/server.xml
<!--	<property name="serverPort">8066</property> <property name="managerPort">9066</property> -->

<property name="serverPort">3306</property>
<property name="managerPort">9066</property> 

        <user name="root" defaultAccount="true">
                <property name="password">123</property>
                <property name="schemas">TESTDB</property>
                <property name="defaultSchema">TESTDB</property>

3. Create the mycat connection user on the backend MySQL

Grant read/write privileges on the hellodb database, since clients reaching the proxy do read/write-split operations against it

[hellodb]>create user mycat@'%' identified by '123';
Query OK, 0 rows affected (0.01 sec)

[hellodb]>grant all on hellodb.* to mycat@'%';
Query OK, 0 rows affected (0.00 sec)

[hellodb]>flush privileges;
Query OK, 0 rows affected (0.01 sec)

4. Implement read/write splitting in schema.xml

Edit schema.xml: set the writeHost and readHost to the backend nodes

schema name: the DB name the mycat proxy presents to clients

<?xml version="1.0"?>
<!DOCTYPE mycat:schema SYSTEM "schema.dtd">
<mycat:schema xmlns:mycat="http://io.mycat/">
    <schema name="TESTDB" checkSQLschema="false" sqlMaxLimit="100" dataNode="dn1">
    </schema>
    <dataNode name="dn1" dataHost="localhost1" database="hellodb" /> <!-- backend DB name -->
    <dataHost name="localhost1" maxCon="1000" minCon="10" balance="1"
              writeType="0" dbType="mysql" dbDriver="native" switchType="1"  slaveThreshold="100">
        <heartbeat>select user()</heartbeat>
        <writeHost host="host1" url="10.0.0.132:3306" user="mycat" password="123"> <!-- backend DB1 (master) IP -->
         <readHost host="host2" url="10.0.0.128:3306" user="mycat" password="123" /> <!-- backend DB2 (slave) IP -->
        </writeHost>
    </dataHost>
</mycat:schema>

Verify the results after editing server.xml and schema.xml

##restart the service
mycat restart

##watch the log; the "jvm 1" prefix shows mycat runs inside a JVM
tail -f /apps/mycat/logs/wrapper.log
INFO   | jvm 1    | 2022/09/17 14:15:46 | MyCAT Server startup successfully. see logs in logs/mycat.log

##confirm the listening port has changed to 3306
ss -ntlp | grep 3306
LISTEN     0      100         :::3306                    :::*                   users:(("java",pid=50461,fd=86))

##connect a client to the mycat proxy from any host with network access; what it sees is hellodb's content, and reads/writes work directly
mysql -uroot -p123 -h 10.0.0.130
mysql> show databases;
+----------+
| DATABASE |
+----------+
| TESTDB   |
+----------+
1 row in set (0.00 sec)

mysql> show tables;
+-------------------+
| Tables_in_hellodb |
+-------------------+
| classes           |
| coc               |
| courses           |
| scores            |
| stu               |
| teachers          |
| testlog           |
| toc               |
+-------------------+
8 rows in set (0.01 sec)

5. Verify that mycat actually splits reads and writes between server1 and server2, using the general query log

##enable general_log on server1 and server2
[hellodb]>select @@general_log;
+---------------+
| @@general_log |
+---------------+
|             0 |
+---------------+

set global general_log=1;
show variables like '%general_log%';

[hellodb]>show variables like '%general_log%';
+------------------+-----------------------+
| Variable_name    | Value                 |
+------------------+-----------------------+
| general_log      | ON                    |
| general_log_file | /data/mysql/rocky.log |
+------------------+-----------------------+

[(none)]>show variables like '%general_log%';
+------------------+------------------------+
| Variable_name    | Value                  |
+------------------+------------------------+
| general_log      | ON                     |
| general_log_file | /data/mysql/master.log |
+------------------+------------------------+

Watch the logs from a shell on each server:
tail -f /data/mysql/rocky.log
tail -f /data/mysql/master.log

Write a row through mycat from the ubuntu client:
insert into teachers(name,age,gender) values('mycat',11,'M');

##on the 132 master the insert is logged; the slave only records the replicated transaction
2022-09-17T06:40:37.211746Z	  112 Query	insert into teachers(name,age,gender) values('mycat',11,'M')

2022-09-17T06:40:37.215598Z	    7 Query	BEGIN
2022-09-17T06:40:37.216535Z	    7 Query	COMMIT /* implicit, from Xid_log_event */

##check which server handles reads via server_id: the query went to the 128 slave
select @@server_id;
mysql> select @@server_id;
+-------------+
| @@server_id |
+-------------+
|         128 |
+-------------+

##try an update: age is set to @@server_id and becomes 132, the master's id, proving read/write splitting works
mysql> update teachers set age=@@server_id where tid=13;
Query OK, 1 row affected (0.01 sec)

mysql> select * from teachers where tid=13;
+-----+-------+-----+--------+
| TID | Name  | Age | Gender |
+-----+-------+-----+--------+
|  13 | mycat | 132 | M      |
+-----+-------+-----+--------+
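
The @@server_id trick above scales into a quick distribution check: query @@server_id through the proxy repeatedly and tally the answers. An offline sketch over sample results; a live version would replace the `results` list with repeated `mysql -h 10.0.0.130 -e 'select @@server_id'` calls (that live call is an assumption of this sketch, not something run here):

```shell
#!/bin/bash
# Tally which backend answered each read, using @@server_id values.
# The sample results below stand in for repeated live queries via the proxy.
results=(128 128 132 128 128)

declare -A count
for id in "${results[@]}"; do
    count[$id]=$(( ${count[$id]:-0} + 1 ))
done

for id in "${!count[@]}"; do
    echo "server $id answered ${count[$id]} reads"
done
```

With balance="1" in schema.xml you would expect most reads to land on the readHost, which is what the tally makes visible.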

Purpose of heartbeat: select user()

Used for health checks on the backend MySQL servers: the mycat server periodically sends the health-check statement (a select user()) to each backend MySQL machine; if a result comes back, the MySQL service is considered alive.

2022-09-17T06:33:46.730245Z	   20 Query	select user()
2022-09-17T06:33:56.730453Z	   16 Query	select user()
2022-09-17T06:34:06.731198Z	   18 Query	select user()
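
The heartbeat behavior above can be sketched as a probe-and-threshold loop. This is an illustration of the idea, not mycat's actual code; `probe_backend` is a made-up stand-in for running `select user()` against a backend:

```shell
#!/bin/bash
# Simplified mycat-style heartbeat: probe a backend, count consecutive
# failures, and mark the node down once a threshold is reached.

MAX_FAILS=3          # consecutive failures before marking the node down
fails=0
status=alive

probe_backend() {    # demo stub: argument 0 means the probe succeeds
    return "$1"
}

# Simulated probe results: two failures, one success, then three failures.
for result in 1 1 0 1 1 1; do
    if probe_backend "$result"; then
        fails=0
        status=alive
    else
        fails=$((fails + 1))
        if [ "$fails" -ge "$MAX_FAILS" ]; then
            status=down
        fi
    fi
done

echo "status=$status fails=$fails"
```

Requiring several consecutive failures avoids flapping on a single lost probe, which is why one missed heartbeat does not immediately take a backend out of rotation.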

Remaining pain points of mycat

1. The proxy node itself is a single point of failure.

2. It cannot promote a slave when the backend master dies; MHA high availability can be used for that.

Key points: implementing MySQL master/slave replication, data synchronization, recording and configuring the binlog position. In MySQL 5.6 and later you can enable GTID so the binlog position is tracked automatically, which makes setting up replication much simpler.


MySQL high availability with MHA

Implements automatic promotion of a slave to replace a failed master, the MySQL counterpart of SQL Server's Always On.

Always On HA: failover is automatic. With A as primary and B as secondary, traffic moves to B when A fails. The Always On cluster exposes a virtual IP; applications connect only to that IP, and which backend actually serves writes or reads is transparent to them.

→ Always On availability group

Recovering data:

Whether with MHA or Always On, replication always involves some lag, so when a node dies and a slave has to be promoted to new master, some data will inevitably be lost. MHA (or the Always On cluster manager) tries to minimize this, recovering as much as possible from the redo log/binlog; data can also be restored manually.

MySQL's solution: the MHA high-availability scheme, which is very widely used, typically one master and one slave, but also one master with several slaves. The primary/standby architecture of Alibaba Cloud RDS is probably built on a similar mechanism.

Reliability of an HA solution: it must be highly reliable. Losing absolutely no data is unrealistic, so aim for as little loss as possible. Which database and which HA solution to use depends on the business; different types of DB call for different HA designs.


How MHA works

1. It uses a health-check statement (a SELECT) to monitor master/slave health; once the master goes down, the manager saves the master's binlog.

2. It identifies which slave has the most up-to-date data, since each slave applies changes through its SQL thread and may not have flushed everything to disk yet.

3. It applies the relay-log differences to the other slaves.

4. It applies the binlog events saved from the master to all slaves. (If the machine is mostly intact and only the service died, you can also copy the master's binlog over manually.)

5. It promotes one slave to be the new master (a manager action).

6. It repoints the other slaves at the new master (a manager action).

7. The failed node is removed and can then be repaired.

MHA's externally visible address is a VIP (virtual IP), similar to SQL Server's Always On: the cluster provides a virtual IP that decides where reads/writes go, automatically follows the new master, and carries the write traffic.

Installing MHA and setting up the MySQL replication environment

The MHA packages split into manager and node (agent) roles, i.e. server side and client side. The manager node manages the individual MySQL clusters; every master and slave inside a cluster is a client of the manager.

manager package: installs on CentOS 7 machines only

node package: works on CentOS 8 and later


MHA checks MySQL cluster health with a SELECT as well: once there is no response, it judges the master faulty and promotes a slave to be the new master.

This is just like the mycat proxy's select user() heartbeat; Always On nodes certainly have health checks too.


Environment (a client host can be added):
10.0.0.130 manager/mycat
10.0.0.132 master
10.0.0.128 slave1
10.0.0.129 slave2

1. Configure the master/slave relationship (replication) in my.cnf

Basic master config: enable GTID so transactions are identified and the binlog position is discovered automatically. It must be on for both master and slaves; otherwise the slaves will report errors.

[mysqld]
server-id=132 ## unique server id
datadir=/data/mysql ## data directory
log-bin=/data/mysql/binlog/rocky-bin ## binlog directory and file prefix
socket=/data/mysql/mysql.sock
log-error=/data/mysql/mysql.log ## error log
pid-file=/data/mysql/mysql.pid
#skip-grant-tables
default_authentication_plugin=mysql_native_password
#general_log ## can also be enabled at runtime with: set global general_log=1

rpl_semi_sync_master_enabled ## semi-synchronous replication; a timeout can also be configured

gtid_mode=on ## enable GTID: transactions are identified and the binlog position is found automatically
enforce_gtid_consistency

[client]
socket=/data/mysql/mysql.sock

Basic slave config: enable GTID as well; it must be on for both master and slaves or the slaves will report errors.

[mysqld]
server-id=128
log-bin=/data/mysql/binlog/slave-bin
datadir=/data/mysql
socket=/data/mysql/mysql.sock
log-error=/data/mysql/mysql.log
pid-file=/data/mysql/mysql.pid

rpl_semi_sync_slave_enabled ## semi-synchronous replication; a timeout can also be configured

gtid_mode=on ## enable GTID on both master and slaves; slaves will error out otherwise
enforce_gtid_consistency
read_only

[client]
socket=/data/mysql/mysql.sock
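
Two classic mistakes with the configs above are duplicated server-id values and GTID enabled on only one side. A small self-check sketch over inline copies of the two configs (the parsing is this note's own simplification, not an official tool):

```shell
#!/bin/bash
# Verify that master and slave configs both enable GTID and use distinct ids.
master_cnf='server-id=132
gtid_mode=on
enforce_gtid_consistency'
slave_cnf='server-id=128
gtid_mode=on
enforce_gtid_consistency'

# get KEY CNF_TEXT -> value of KEY
get() { echo "$2" | awk -F= -v k="$1" '$1 == k {print $2}'; }

errors=0
[ "$(get server-id "$master_cnf")" != "$(get server-id "$slave_cnf")" ] \
    || { echo "duplicate server-id"; errors=$((errors + 1)); }
for cnf in "$master_cnf" "$slave_cnf"; do
    [ "$(get gtid_mode "$cnf")" = "on" ] \
        || { echo "gtid_mode missing"; errors=$((errors + 1)); }
done
echo "errors=$errors"
```

For real files, replace the inline variables with `$(cat /etc/my.cnf)` on each host.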

2. Install the manager and node packages

#!/bin/bash
host=(10.0.0.128
10.0.0.132
10.0.0.129)

##scp the rpm package over, answering the prompts with expect
for i in ${host[*]};do
        expect <<EOF
        spawn scp mha4mysql-node-0.58-0.el7.centos.noarch.rpm $i:/root
        expect {
             "yes/no" { send "yes\n"; exp_continue }
             "password" { send "123\n" };
}
        expect eof
EOF

done

yum -y install mha4mysql-manager-0.58-0.el7.centos.noarch.rpm
yum -y install mha4mysql-node-0.58-0.el7.centos.noarch.rpm

3. Set up SSH key trust among the MySQL master/slave machines, and from the manager to every MySQL node

First generate your own ssh key, then copy it to each node; known_hosts already holds all 4 machines from the copies just made.

Copy your .ssh directory to /root on the other machines:

$i:/root/.ssh copies the .ssh directory itself

$i:/root/.ssh/* copies only the contents of the .ssh directory

ssh-keygen
ssh-copy-id 10.0.0.130

[10:34:49 root@proxy ~]#cat .ssh/known_hosts 
10.0.0.132 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBOEAlD1m567k7Q2QEem7PjJOu2e91dNnnvu3BieR3f3TVFQdqsWqGI/qCgJVLQ7KWWPNFy137uYUwviZa7d8Q/Q=
10.0.0.128 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBGNiEH9kMJZU+HXl2fpHO2JQ9tIbiVh5qljJS+ZoGh1iNzX3S7pGM/Dd16BkSVJgrgMyXOdr832C7na6SIgoylM=
10.0.0.129 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBOf8Uo6Tm0vy+2g9ctYsTwc/tfWcXHQtY5EywK0odPY1KFdXvw/bi75eyTbBqMUnX90RY36V+SV2CF05KCP1FJ0=
10.0.0.130 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBOf8Uo6Tm0vy+2g9ctYsTwc/tfWcXHQtY5EywK0odPY1KFdXvw/bi75eyTbBqMUnX90RY36V+SV2CF05KCP1FJ0=

##distribute the ssh credentials
#!/bin/bash
host=(10.0.0.128
10.0.0.132
10.0.0.129)

##scp with expect; the ssh-copy-id verification still has to happen first
for i in ${host[*]};do
	  expect <<EOF
        spawn ssh-copy-id $i
        expect {
             "yes/no" { send "yes\n"; exp_continue }
             "password" { send "123\n" };
}
        expect eof
EOF
      scp -r /root/.ssh $i:/root
done
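
Before running masterha_check_ssh, it is worth confirming each host already has a known_hosts entry, since the expect scripts can hang on an unexpected yes/no prompt. A sketch that checks an inline sample file (point KNOWN_HOSTS at the real ~/.ssh/known_hosts in practice; the truncated keys here are placeholders):

```shell
#!/bin/bash
# Check every cluster host has a known_hosts entry; report the ones missing.
KNOWN_HOSTS=$(mktemp)
cat > "$KNOWN_HOSTS" <<'EOF'
10.0.0.132 ecdsa-sha2-nistp256 AAAA...sample
10.0.0.128 ecdsa-sha2-nistp256 AAAA...sample
10.0.0.129 ecdsa-sha2-nistp256 AAAA...sample
EOF

hosts=(10.0.0.128 10.0.0.132 10.0.0.129 10.0.0.130)
missing=0
for h in "${hosts[@]}"; do
    if ! grep -q "^$h " "$KNOWN_HOSTS"; then
        echo "missing known_hosts entry: $h"
        missing=$((missing + 1))
    fi
done
echo "missing=$missing"
rm -f "$KNOWN_HOSTS"
```

In this sample 10.0.0.130 has no entry, so the check flags exactly one host.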

4. Manager-side configuration

app1.cnf configuration

##create a directory for the manager's config and working files
mkdir -p /data/mastermha/app1

vim /data/mastermha/app1.cnf
##app1.cnf contents
[server default] 
user=mhauser ## created on the master below; replicates to the slaves automatically
password=123
manager_workdir=/data/mastermha/app1/
manager_log=/data/mastermha/app1/manager.log ## manager log, created automatically
remote_workdir=/data/mastermha/app1/
ssh_user=root
repl_user=sync ## replication user, created below
repl_password=123
ping_interval=1
master_ip_failover_script=/usr/local/bin/master_ip_failover
report_script=/usr/local/bin/sendmail.sh
check_repl_delay=0
master_binlog_dir=/data/mysql/binlog ## binlog location

[server1]
hostname=10.0.0.132
candidate_master=1 ## preferred candidate for master
[server2]
hostname=10.0.0.128
candidate_master=1 ## preferred candidate for master
[server3]
hostname=10.0.0.129
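
Typos in app1.cnf keys only surface later in masterha_check_repl, so a quick key-presence check saves a round trip. A sketch over an inline copy of the config; the required-key list is this note's own assumption, not an official MHA schema:

```shell
#!/bin/bash
# Check that app1.cnf defines the keys this failover setup relies on.
CNF=$(mktemp)
cat > "$CNF" <<'EOF'
[server default]
user=mhauser
password=123
ssh_user=root
repl_user=sync
repl_password=123
master_binlog_dir=/data/mysql/binlog
master_ip_failover_script=/usr/local/bin/master_ip_failover
[server1]
hostname=10.0.0.132
EOF

required=(user password ssh_user repl_user repl_password master_binlog_dir)
missing=0
for key in "${required[@]}"; do
    grep -q "^${key}=" "$CNF" || { echo "missing: $key"; missing=$((missing + 1)); }
done
echo "missing=$missing"
rm -f "$CNF"
```

Point CNF at the real /data/mastermha/app1.cnf for an actual run.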

##check that binlog is on; you can also enable general_log to watch the effects
[(none)]>show variables like '%log_bin%';

##enable general_log on both master and slaves
select @@general_log;
set global general_log=ON;

Grant the mha account on the MySQL cluster nodes

MHA may change the MySQL cluster's configuration, so the account needs all privileges on all databases, and the credentials must match app1.cnf.

##create the user on the master; it replicates to the slaves automatically
[hellodb]>create user mhauser@'%' identified by '123';
Query OK, 0 rows affected (0.01 sec)

[hellodb]>grant all on *.* to mhauser@'%';
Query OK, 0 rows affected (0.00 sec)

[hellodb]>flush privileges;
Query OK, 0 rows affected (0.00 sec)

[hellodb]>select user,host from mysql.user;
+------------------+-----------+
| user             | host      |
+------------------+-----------+
| mhauser          | %         |
| mycat            | %         |
| root             | %         |
| sync             | %         |
| wp               | %         |
| catyer           | 10.0.0.%  |
| mysql.infoschema | localhost |
| mysql.session    | localhost |
| mysql.sys        | localhost |
| root             | localhost |
| wpuser           | localhost |
+------------------+-----------+
11 rows in set (0.00 sec)

Add a VIP on the master node, following the perl script's convention

When the master goes down, the VIP can float to the new master.

Put the perl script under /usr/local/bin; the VIP is added to the NIC temporarily:

my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip"; ##adds an extra IP on the NIC; remember to change the NIC name to match your machine
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";

##master node VIP configuration
ifconfig ens224
ifconfig ens224:1 10.0.0.100/24
2: ens224: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:0c:29:33:3d:85 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.132/24 brd 10.0.0.255 scope global noprefixroute ens224
       valid_lft forever preferred_lft forever
    inet 10.0.0.100/24 brd 10.0.0.255 scope global secondary ens224:1
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fe33:3d85/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever

The perl script: /usr/local/bin/master_ip_failover

Put the script at that path and remember to make it executable:
chmod +x /usr/local/bin/master_ip_failover

The script:

#!/usr/bin/env perl

#  Copyright (C) 2011 DeNA Co.,Ltd.
#
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  You should have received a copy of the GNU General Public License
#   along with this program; if not, write to the Free Software
#  Foundation, Inc.,
#  51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

## Note: This is a sample script and is not complete. Modify the script based on your environment.

use strict;
use warnings FATAL => 'all';

use Getopt::Long;
use MHA::DBHelper;

my (
  $command,        $ssh_user,         $orig_master_host,
  $orig_master_ip, $orig_master_port, $new_master_host,
  $new_master_ip,  $new_master_port,  $new_master_user,
  $new_master_password
);
my $vip = '10.0.0.100/24'; ##the VIP shown to applications
my $key = "1";
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip"; ##adds an extra IP on the NIC; remember to change the NIC name to match your machine
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";

GetOptions(
  'command=s'             => \$command,
  'ssh_user=s'            => \$ssh_user,
  'orig_master_host=s'    => \$orig_master_host,
  'orig_master_ip=s'      => \$orig_master_ip,
  'orig_master_port=i'    => \$orig_master_port,
  'new_master_host=s'     => \$new_master_host,
  'new_master_ip=s'       => \$new_master_ip,
  'new_master_port=i'     => \$new_master_port,
  'new_master_user=s'     => \$new_master_user,
  'new_master_password=s' => \$new_master_password,
);

exit &main();

sub main {
  if ( $command eq "stop" || $command eq "stopssh" ) {

    # $orig_master_host, $orig_master_ip, $orig_master_port are passed.
    # If you manage master ip address at global catalog database,
    # invalidate orig_master_ip here.
    my $exit_code = 1;
    eval {

      # updating global catalog, etc
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
    elsif ( $command eq "start" ) {

        # all arguments are passed.
        # If you manage master ip address at global catalog database,
        # activate new_master_ip here.
        # You can also grant write access (create user, set read_only=0, etc) here.
        my $exit_code = 10;
        eval {
            print "Enabling the VIP - $vip on the new master - $new_master_host \n";
            &start_vip();
            &stop_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn $@;
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "status" ) {
        print "Checking the Status of the script.. OK \n";
        `ssh $ssh_user\@$orig_master_host \" $ssh_start_vip \"`;
        exit 0;
    }
    else {
        &usage();
        exit 1;
    }
}


sub start_vip() {
    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
# A simple system call that disables the VIP on the old master
sub stop_vip() {
   `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}


sub usage {
  print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}

Automatic alert email

Configure 163 mailbox authentication; also update the postfix package so the local mail service is not too old.

vim /etc/mail.rc
set from=catyer@163.com
set smtp=smtp.163.com
set smtp-auth-user=catyer@163.com
set smtp-auth-password=SOKFDDZOQCDEZCYY

yum -y install postfix

vim sendmail.sh
##remember to make it executable
chmod +x sendmail.sh
##sendmail.sh content:
echo "MHA is down!" | mail -s "MHA Warning" catyer@163.com

5. Validate the app1.cnf configuration

Both checks should finish with "successfully", confirming connectivity to each node.

##check SSH connectivity between all nodes
masterha_check_ssh --conf=/root/mastermha/app1.cnf

##check master/slave replication
masterha_check_repl --conf=/root/mastermha/app1.cnf


An exception is thrown: the super_read_only variable cannot be checked, because the libmysqlclient.so.18 dependency is missing.

Nothing in the configuration needs changing; the libmysqlclient.so.18 library file is simply absent.

It has nothing to do with super_read_only being off, either.


Checking if super_read_only is defined and turned on..install_driver(mysql) failed: Can't load '/usr/lib64/perl5/vendor_perl/auto/DBD/mysql/mysql.so' for module DBD::mysql: libmysqlclient.so.18: cannot open shared object file: No such file or directory at /usr/lib64/perl5/DynaLoader.pm line 190.
 at (eval 11) line 3.
Compilation failed in require at (eval 11) line 3

Things poked at while debugging (the real fix is installing the library):

##the relevant code in the MHA perl module
vim +244 /usr/share/perl5/vendor_perl/MHA/SlaveUtil.pm
my $sth = $dbh->prepare("SELECT \@\@global.super_read_only as Value");

##it can be patched to skip super_read_only:
my $sth = $dbh->prepare("SELECT 0 as Value");
#$dbh->do("SET GLOBAL super_read_only=off;");
$dbh->do("SET GLOBAL read_only=on;");

##the heartbeat query elsewhere in the module:
my $sth = $dbh->prepare("SELECT 1 as Value");

##library workarounds that were tried:
ln -s /usr/local/mysql/lib/libmysqlclient.so.21.2.30 /usr/lib64/libmysqlclient.so.21.2.30
wget -O /usr/lib/libmysqlclient.so.18 http://files.directadmin.com/services/es_7.0_64/libmysqlclient.so.18

##toggling the variable by hand:
set global super_read_only=ON;
set global super_read_only=OFF;

Root cause: MySQL was installed from the binary tarball, so the mysql-community-libs-compat-8.0.30-1.el7.x86_64.rpm package is missing; the copy downloaded from the official site would not install either, due to unmet dependencies.

yum -y install mysql-community-libs
Last metadata expiration check: 0:45:58 ago on Sun 18 Sep 2022 15:49:55.
No match for argument: mysql-community-libs
Error: Unable to find a match: mysql-community-libs

##install mysql-8.0.30 via yum on another machine, then copy its /usr/lib64/mysql/libmysqlclient.so.18 to every node
host=(10.0.0.128
10.0.0.132
10.0.0.129)

for i in ${host[*]};do
        scp /usr/lib64/mysql/libmysqlclient.so.18 $i:/usr/lib64/mysql/
done
[root@master mysql]#ls
libmysqlclient.a  libmysqlclient.so  libmysqlclient.so.18  libmysqlservices.a


check ok


6. Start the MHA process and watch the log. If a master in the cluster goes down (and failover runs), the MHA process has to be started again afterwards.

Run on the management node:

masterha_manager --conf=/root/mastermha/app1.cnf --remove_dead_master_conf --ignore_last_failover

##run in the background, as you would in production
nohup masterha_manager --conf=/root/mastermha/app1.cnf --remove_dead_master_conf --ignore_last_failover &


Looking at the master's general_log, the health-check entries are there; only the MASTER node's health is checked.

[hellodb]>show variables like '%general_log%';

tail -f /data/mysql/rocky.log 
tail -f /data/mysql/master.log 
tail -f /data/mysql/slave1.log 
2022-09-18T08:56:24.776722Z	  205 Query	SELECT 1 As Value
2022-09-18T08:56:25.777208Z	  205 Query	SELECT 1 As Value
2022-09-18T08:56:26.777319Z	  205 Query	SELECT 1 As Value

tail -f /data/mastermha/app1/manager.log
ping SELECT until mysql doesn't respond ##health check; only the MASTER is health-checked, so slave1 and slave2 show none

[16:53:19 root@proxy ~]#tail -f /data/mastermha/app1/manager.log
Checking the Status of the script.. OK 
SIOCSIFADDR: No such device
eth0:1: ERROR while getting interface flags: No such device
SIOCSIFNETMASK: No such device
Sun Sep 18 16:55:27 2022 - [info]  OK.
Sun Sep 18 16:55:27 2022 - [warning] shutdown_script is not defined.
Sun Sep 18 16:55:27 2022 - [info] Set master ping interval 1 seconds.
Sun Sep 18 16:55:27 2022 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.
Sun Sep 18 16:55:27 2022 - [info] Starting ping health check on 10.0.0.132(10.0.0.132:3306)..
Sun Sep 18 16:55:27 2022 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..

7. Test MHA failover by simulating a master outage

The MHA process is one-shot: once it goes down (after handling a failover), it must be started again.
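
Because masterha_manager exits after one failover, production setups often wrap it in a restart loop (or a systemd unit). A generic restart-loop sketch, demonstrated with a dummy command standing in for masterha_manager (the real invocation would be the `--conf=/root/mastermha/app1.cnf` command above):

```shell
#!/bin/bash
# Generic supervisor loop: rerun a command whenever it exits, up to a cap.
MAX_RESTARTS=3
runs=0

run_once() {   # dummy stand-in for masterha_manager; always exits nonzero
    echo "manager run $((runs + 1)) finished"
    return 1
}

while [ "$runs" -lt "$MAX_RESTARTS" ]; do
    run_once || true        # ignore the nonzero exit; we restart regardless
    runs=$((runs + 1))
    # sleep 5               # back off before restarting in a real setup
done
echo "runs=$runs"
```

A restart cap (or backoff) matters here: blindly relaunching after every failover can mask a flapping master.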

Run the stored procedure on the master; the rows replicate to the slaves automatically.

Since candidate_master=1 was configured in app1.cnf, 128 has now replaced 132 as the master.

call sp_testlog;
select count(*) from testlog;

##stop the MySQL service on the master
systemctl stop mysqld

##a failover report is written to manager.log
tail -f /data/mastermha/app1/manager.log
----- Failover Report -----

app1: MySQL Master failover 10.0.0.132 to 10.0.0.128 succeeded
Master 10.0.0.132 is down!
Check MHA Manager logs at proxy:/data/mastermha/app1/manager.log for details.


Check the state of 10.0.0.128: it has been switched to master. Verify how much of the data made it across.

If new data was written while the MASTER was down, do a full dump from the new master and restore it before the repaired node rejoins the MySQL cluster.

##the failed node's slave status is gone
[hellodb]>show slave status \G;
Empty set, 1 warning (0.00 sec)

##on the surviving slave, the master has switched to 128
[hellodb]>show slave status \G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for source to send event
                  Master_Host: 10.0.0.128
                  Master_User: sync
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: slave-bin.000013
          Read_Master_Log_Pos: 35534754
               Relay_Log_File: slave1-relay-bin.000002
                Relay_Log_Pos: 326
        Relay_Master_Log_File: slave-bin.000013
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
            
##check the stored-procedure results for consistency; the counts basically match
[hellodb]>select count(*) from testlog;
+----------+
| count(*) |
+----------+
|   115030 |
+----------+

##the new master's process list shows the dump thread serving the 129 slave
| 111 | sync            | 10.0.0.129:45244 | NULL    | Binlog Dump |  1413 | Source has sent all binlog to replica; waiting for more updates | NULL             |

##if nothing changed and no new data was written, the repaired node can simply CHANGE MASTER back in:
[hellodb]>flush logs;
Query OK, 0 rows affected (0.00 sec)

[hellodb]>show master status;
+------------------+----------+--------------+------------------+-----------------------------------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                             |
+------------------+----------+--------------+------------------+-----------------------------------------------+
| slave-bin.000014 |      197 |              |                  | 5aa76b1b-1b77-11ed-9219-000c29333d85:1-115049 |
+------------------+----------+--------------+------------------+-----------------------------------------------+

##point at the new master
CHANGE MASTER TO
MASTER_HOST='10.0.0.128',
MASTER_USER='sync',
MASTER_PASSWORD='123',
MASTER_PORT=3306,
MASTER_LOG_FILE='slave-bin.000014', MASTER_LOG_POS=197;

start slave;

##slave status is OK
[(none)]>show slave status \G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for source to send event
                  Master_Host: 10.0.0.128
                  Master_User: sync
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: slave-bin.000014
          Read_Master_Log_Pos: 197
               Relay_Log_File: rocky-relay-bin.000002
                Relay_Log_Pos: 326
        Relay_Master_Log_File: slave-bin.000014
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
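
A recurring manual step above is eyeballing Slave_IO_Running/Slave_SQL_Running. That check is easy to script; the sketch below parses a captured `show slave status\G` snippet (inline sample output here, not a live query; a real version would feed in `mysql -e 'show slave status\G'`):

```shell
#!/bin/bash
# Decide replication health from `show slave status\G` output:
# both Slave_IO_Running and Slave_SQL_Running must be Yes.
STATUS=$(cat <<'EOF'
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
        Seconds_Behind_Master: 0
EOF
)

io=$(echo "$STATUS"  | awk -F': ' '/Slave_IO_Running/  {print $2}')
sql=$(echo "$STATUS" | awk -F': ' '/Slave_SQL_Running/ {print $2}')

if [ "$io" = "Yes" ] && [ "$sql" = "Yes" ]; then
    health=ok
else
    health=broken
fi
echo "health=$health"
```

The same parse can pull Seconds_Behind_Master to alert on replication lag as well.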

TiDB: a distributed database

Distributed tables

MySQL's maximum connections: show variables like '%connection%'

Default maximum concurrent connections: 151

[(none)]>show variables like '%connection%';
+-----------------------------------+----------------------+
| Variable_name                     | Value                |
+-----------------------------------+----------------------+
| character_set_connection          | utf8mb4              |
| collation_connection              | utf8mb4_0900_ai_ci   |
| connection_memory_chunk_size      | 8912                 |
| connection_memory_limit           | 18446744073709551615 |
| global_connection_memory_limit    | 18446744073709551615 |
| global_connection_memory_tracking | OFF                  |
| max_connections                   | 151                  |
| max_user_connections              | 0                    |
| mysqlx_max_connections            | 100                  |
+-----------------------------------+----------------------+

mysqlslap: the MySQL load-testing tool

Test case: auto-generate a test (-a) and run it with 1000 concurrent clients:
mysqlslap -a -c 1000

[root@rocky ~]#mysqlslap -uroot -p123 -a -c 1000
mysqlslap: [Warning] Using a password on the command line interface can be insecure.
Benchmark
	Average number of seconds to run all queries: 17.270 seconds
	Minimum number of seconds to run all queries: 17.270 seconds
	Maximum number of seconds to run all queries: 17.270 seconds
	Number of clients running queries: 1000
	Average number of queries per client: 0

Raising MySQL's maximum concurrent connections

set global max_connections=5000;

vim /etc/my.cnf
max_connections=5000

[(none)]>set global max_connections=5000;
Query OK, 0 rows affected (0.00 sec)

[(none)]>show databases;
+--------------------+
| Database           |
+--------------------+
| catyer             |
| hellodb            |
| information_schema |
| mysql              |
| mysqlslap          |
| performance_schema |
| sys                |
| trigger_pra        |
| wordpress          |
| zabbix             |
+--------------------+
10 rows in set (0.02 sec)
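
To compare runs at different max_connections settings, the mysqlslap summary can be reduced to a single connections-per-second figure. A parsing sketch over a captured benchmark block (inline sample, not a live run):

```shell
#!/bin/bash
# Extract average runtime and client count from mysqlslap output and
# compute a rough connections-handled-per-second figure.
BENCH=$(cat <<'EOF'
Benchmark
	Average number of seconds to run all queries: 17.270 seconds
	Minimum number of seconds to run all queries: 17.270 seconds
	Maximum number of seconds to run all queries: 17.270 seconds
	Number of clients running queries: 1000
EOF
)

avg=$(echo "$BENCH" | awk '/Average number of seconds/ {print $(NF-1)}')
clients=$(echo "$BENCH" | awk '/Number of clients/ {print $NF}')
cps=$(awk -v c="$clients" -v t="$avg" 'BEGIN {printf "%.1f", c / t}')
echo "clients=$clients avg=${avg}s cps=$cps"
```

Pipe a real run in with `mysqlslap -uroot -p123 -a -c 1000 2>/dev/null` in place of the sample block.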