作者:任仲禹

爱可生 DBA 团队成员,擅长故障分析和性能优化,文章相关技术问题,欢迎大家一起讨论。



背景

源于某客户的需求,存在线上某业务 MySQL 库因为数据量及业务读写压力较大,需要将业务数据迁移到 DBLE 分布式数据库,但同时因为业务为 7x24h,能够停机的时间窗口较短,所以需要考虑数据实时同步的方案。

过往 DBLE 的业务上线基本为全新部署,数据实时同步的情况极少实施,去年 DTLE 发布后这一问题得到了些改善,今天我们来实践下。

环境准备

1. 目标端 DBLE 集群部署

  • 安装 DBLE 软件、后端分片 MySQL 库过程略
  • DBLE 版本 3.20.10.8、MySQL版本 5.7.25
  • sharding.xml
<?xml version="1.0"?>
<!DOCTYPE dble:sharding SYSTEM "sharding.dtd">
<dble:sharding xmlns:dble="http://dble.cloud/" version="4.0">
<schema name="dtle" sqlMaxLimit="-1" shardingNode="dn_01">
<singleTable name="gtid_executed_v4" shardingNode="dn_01" sqlMaxLimit="-1"></singleTable>
</schema>
<schema name="ren" sqlMaxLimit="-1" shardingNode="dn_01">
<shardingTable name="test" shardingNode="dn_01,dn_02,dn_03,dn_04" sqlMaxLimit="-1" shardingColumn="id" function="func_jumphash"></shardingTable>
</schema>
<shardingNode name="dn_03" dbGroup="dh-mysql-cluster02" database="dh_dn_03"></shardingNode>
<shardingNode name="dn_04" dbGroup="dh-mysql-cluster02" database="dh_dn_04"></shardingNode>
<shardingNode name="dn_02" dbGroup="dh-mysql-cluster01" database="dh_dn_02"></shardingNode>
<shardingNode name="dn_01" dbGroup="dh-mysql-cluster01" database="dh_dn_01"></shardingNode>
<function name="func_jumphash" class="jumpStringHash">
<property name="partitionCount">4</property>
<property name="hashSlice">0:-1</property>
</function>
</dble:sharding>
  • db.xml
<?xml version="1.0"?>
<!DOCTYPE dble:db SYSTEM "db.dtd">
<dble:db xmlns:dble="http://dble.cloud/" version="4.0">
<dbGroup name="dh-mysql-cluster02" rwSplitMode="0" delayThreshold="-1">
<heartbeat timeout="0" errorRetryCount="0">show slave status</heartbeat>
<dbInstance name="10.186.61.13-3326-dh-1" url="10.186.61.13:3326" user="dbleuser" password="jpfmxIeMt1vxAJ6zd6Q10PGRRi+Qj023Dl+YXuOr3C4VXTdV5+GJaOIv5iVmWCwpXcucn/zi02HVlT7ADX+m6Q==" maxCon="100" minCon="10" primary="true" readWeight="0" id="mysql-i63009" usingDecrypt="true"></dbInstance>
</dbGroup>
<dbGroup name="dh-mysql-cluster01" rwSplitMode="0" delayThreshold="-1">
<heartbeat timeout="0" errorRetryCount="0">show slave status</heartbeat>
<dbInstance name="10.186.61.11-3316-dh-1" url="10.186.61.11:3316" user="dbleuser" password="QQWRF80AGNbx4jIAx/b2Ww7Myol1+ntlyzGmA1A3PXVISmRD/i5pgRnLLwYsXoLmH0jiv1qZAkqIBHv6Yg/XAg==" maxCon="100" minCon="10" primary="true" readWeight="0" id="mysql-47vn84" usingDecrypt="true"></dbInstance>
</dbGroup>
</dble:db>
  • user.xml
<?xml version="1.0"?>
<!DOCTYPE dble:user SYSTEM "user.dtd">
<dble:user xmlns:dble="http://dble.cloud/" version="4.0">
<managerUser name="root" password="CrjpLhvVJkHk0EPW35Y07dUeTimf52zMqClYQkIAN3/dqiG1DVUe9Zr4JLh8Kl+1KH1zd7YTKu5w04QgdyQeDw==" usingDecrypt="true"></managerUser>
<shardingUser name="ren" schemas="ren,dtle" password="P+C2KazQiS3ZZ6uojBJ91MZIqYqGczspQ/ebyBZOC9xKAAkAFrqEDC9OPn/vObAyO4P8Zu3vHQJ+rljM040Kdg==" usingDecrypt="true" readOnly="false" maxCon="0" blacklist="default_black_list"></shardingUser>
</dble:user>

2. 源端和目标端测试表创建

  • 源端 MySQL 数据库软件安装略
  • 源端MySQL与目标端DBLE都需要创建测试表名:test
use ren;
CREATE TABLE `test` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(20) COLLATE utf8mb4_bin DEFAULT NULL,
`city` varchar(20) COLLATE utf8mb4_bin DEFAULT NULL,
`dt` datetime DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `idx_ctiy` (`city`)
) ENGINE=InnoDB;

3. 部署单节点DTLE

  • DTLE社区版本GitHub下载地址:https://github.com/actiontech/dtle/
  • 下载完成后直接rpm安装(本示例使用内部QA验证版本)
  • rpm -ivh --prefix=/data/dtle dtle-ee-9.9.9.9.x86_64.rpm
  • 安装完成确认启动正常
# curl -XGET "127.0.0.1:4646/v1/nodes" -s | jq
[
{
"Address": "127.0.0.1",
"ID": "223c31b4-05cd-a763-b3e7-dbea6d416576",
"Datacenter": "dc1",
"Name": "nomad0",
"NodeClass": "",
"Version": "1.1.2",
"Drain": false,
"SchedulingEligibility": "eligible",
"Status": "ready",
"StatusDescription": "",
"Drivers": {
"dtle": {
"Attributes": {
"driver.dtle.full_version": "9.9.9.9-master-a65ee13",
"driver.dtle": "true",
"driver.dtle.version": "9.9.9.9"
},
"Detected": true,
"Healthy": true,
"HealthDescription": "Healthy",
"UpdateTime": "2022-02-28T07:45:15.650289984Z"
·········
]

创建 MySQL-To-DBLE 任务

一、全量同步

1. 准备job文件
# cat job.json
{
"Job": {
"ID": "mysqlToDBLE",
"Datacenters": ["dc1"],
"TaskGroups": [{
"Name": "src",
"Tasks": [{
"Name": "src",
"Driver": "dtle",
"Config": {
"Gtid": "",
"ReplicateDoDb": [{
"TableSchema": "ren",
"Tables": [{
"TableName": "test"
}]
}],
"ConnectionConfig": {
"Host": "10.186.61.11",
"Port": 3306,
"User": "root",
"Password": "root"
}
}
}]
}, {
"Name": "dest",
"Tasks": [{
"Name": "dest",
"Driver": "dtle",
"Config": {
"ConnectionConfig": {
"Host": "10.186.61.10",
"Port": 8066,
"User": "ren",
"Password": "ren"
}
}
}]
}]
}
}
2. 准备全量复制数据
  • 源端 MySQL库执行
mysql> insert into test values(1,'ren','sh',now());
mysql> insert into test values(2,'jack','bj',now());
mysql> insert into test values(3,'tom','sz',now());
3. 启动同步任务
# curl -XPOST "http://127.0.0.1:4646/v1/jobs" -d @job.json -s| jq
{
"EvalID": "88ab4a42-98b7-696e-0f98-08c1fe3ee4bd",
"EvalCreateIndex": 12310,
"JobModifyIndex": 12310,
"Warnings": "",
"Index": 12310,
"LastContact": 0,
"KnownLeader": false
}
4. 检查同步情况
  • 确认全量数据同步完成
# 目标端 DBLE 中执行
mysql> use ren;
Database changed
mysql> show tables;
+------------------+
| Tables_in_ren |
+------------------+
| test |
| gtid_executed_v4 |
+------------------+
2 rows in set (0.01 sec)

mysql> select * from test;
+----+------+------+---------------------+
| id | name | city | dt |
+----+------+------+---------------------+
| 1 | ren | sh | 2022-03-07 06:53:30 |
| 2 | jack | bj | 2022-03-07 06:53:41 |
| 3 | tom | sz | 2022-03-07 06:53:59 |
+----+------+------+---------------------+
3 rows in set (0.01 sec)

# 源端 MySQL 写入增量测试数据
mysql> insert into test select null,'mike','nj',now();
Query OK, 1 row affected (0.01 sec)
Records: 1 Duplicates: 0 Warnings: 0
········
mysql> insert into test select null,'mike4','nj',now();
Query OK, 1 row affected (0.01 sec)
Records: 1 Duplicates: 0 Warnings: 0

mysql> update test set city = 'sh' where name like 'mike%';
Query OK, 4 rows affected (0.01 sec)
Rows matched: 4 Changed: 4 Warnings: 0

mysql> select * from test;
+----+-------+------+---------------------+
| id | name | city | dt |
+----+-------+------+---------------------+
| 1 | ren | sh | 2022-03-07 06:53:30 |
| 2 | jack | bj | 2022-03-07 06:53:41 |
| 3 | tom | sz | 2022-03-07 06:53:59 |
| 45 | mike | sh | 2022-03-07 08:03:57 |
| 46 | mike2 | sh | 2022-03-07 08:04:02 |
| 47 | mike3 | sh | 2022-03-07 08:04:05 |
| 48 | mike4 | sh | 2022-03-07 08:04:09 |
+----+-------+------+---------------------+
7 rows in set (0.01 sec)

# 目标端 DBLE 检查增量同步情况
mysql> select * from test;
+----+-------+------+---------------------+
| id | name | city | dt |
+----+-------+------+---------------------+
| 1 | ren | sh | 2022-03-07 06:53:30 |
| 2 | jack | bj | 2022-03-07 06:53:41 |
| 3 | tom | sz | 2022-03-07 06:53:59 |
| 45 | mike | sh | 2022-03-07 08:03:57 |
| 46 | mike2 | sh | 2022-03-07 08:04:02 |
| 47 | mike3 | sh | 2022-03-07 08:04:05 |
| 48 | mike4 | sh | 2022-03-07 08:04:09 |
+----+-------+------+---------------------+
7 rows in set (0.04 sec)

mysql> explain select * from test where id = 1;
+---------------+----------+----------------------------------+
| SHARDING_NODE | TYPE | SQL/REF |
+---------------+----------+----------------------------------+
| dn_01 | BASE SQL | select * from test where id = 1 |
+---------------+----------+----------------------------------+
1 row in set (0.03 sec)

二、基于GTID位点增量同步

1. 销毁全量同步任务
# cd /data/dtle/usr/bin/
# ll
total 188836
-rwxr-xr-x 1 root root 107811060 Mar 17 2020 consul
-rwxr-xr-x 1 root root 85550512 Jun 22 2021 nomad
# ./nomad job status
ID Type Priority Status Submit Date
mysqlToDBLE service 50 running 2022-03-07T15:47:31+08:00
mysqltoMysql-sync service 50 running 2022-03-03T16:06:10+08:00
# ./nomad job stop -purge mysqlToDBLE
·······
⠙ Deployment "433ed3d4" successful
·······
# ./nomad job status
ID Type Priority Status Submit Date
mysqltoMysql-sync service 50 running 2022-03-03T16:06:10+08:00
2. 记录源端GTID位点
# 记录源端 MySQL 需要开始的GTID位点
mysql> show master status\G
*************************** 1. row ***************************
File: mysql-bin.000178
········
Executed_Gtid_Set: 442dbe92-00c3-11ec-a0cf-02000aba3d0b:1-49705119,
cdc6fb62-00c2-11ec-a259-02000aba3d0a:1-3555
1 row in set (0.01 sec)
# 插入增量数据(模拟业务新增数据)
mysql> insert into test select 88,'sync01','wh',now();
mysql> insert into test select 99,'sync02','wh',now();
# 源端 MySQL 确认数据已插入
mysql> select * from test;
+----+--------+------+---------------------+
| id | name | city | dt |
+----+--------+------+---------------------+
| 1 | ren | sh | 2022-03-07 06:53:30 |
········
| 48 | mike4 | sh | 2022-03-07 08:04:09 |
| 88 | sync01 | wh | 2022-03-07 08:24:20 |
| 99 | sync02 | wh | 2022-03-07 08:24:31 |
+----+--------+------+---------------------+
9 rows in set (0.00 sec)
# 目标端 DBLE 数据因同步job已销毁,新插入数据未同步过来
mysql> select * from test;
+----+-------+------+---------------------+
| id | name | city | dt |
+----+-------+------+---------------------+
| 1 | ren | sh | 2022-03-07 06:53:30 |
········
| 48 | mike4 | sh | 2022-03-07 08:04:09 |
+----+-------+------+---------------------+
7 rows in set (0.00 sec)
3. 准备增量同步job文件
# cat job.json
{
"Job": {
"ID": "mysqlToDBLE",
"Datacenters": ["dc1"],
"TaskGroups": [{
"Name": "src",
"Tasks": [{
"Name": "src",
"Driver": "dtle",
"Config": {
"Gtid": "442dbe92-00c3-11ec-a0cf-02000aba3d0b:1-49705119,cdc6fb62-00c2-11ec-a259-02000aba3d0a:1-3555",
"ReplicateDoDb": [{
"TableSchema": "ren",
"Tables": [{
"TableName": "test"
}]
}],
"ConnectionConfig": {
"Host": "10.186.61.11",
"Port": 3306,
"User": "root",
"Password": "root"
}
}
}]
}, {
"Name": "dest",
"Tasks": [{
"Name": "dest",
"Driver": "dtle",
"Config": {
"ConnectionConfig": {
"Host": "10.186.61.10",
"Port": 8066,
"User": "ren",
"Password": "ren"
}
}
}]
}]
}
}
4. 开始增量同步任务
# curl -XPOST "http://127.0.0.1:4646/v1/jobs" -d @job.json -s |jq
{
"EvalID": "cad6fb19-62d3-67aa-6f5c-fbb79f8016d2",
"EvalCreateIndex": 12855,
"JobModifyIndex": 12855,
"Warnings": "",
"Index": 12855,
"LastContact": 0,
"KnownLeader": false
}
5. 检查同步情况
# 目标端 DBLE 中查看到 GTID 位点之后的数据已同步过来
mysql> select * from test;
+-----+--------+------+---------------------+
| id | name | city | dt |
+-----+--------+------+---------------------+
| 1 | ren | sh | 2022-03-07 06:53:30 |
| 48 | mike4 | sh | 2022-03-07 08:04:09 |
·········
| 88 | sync01 | wh | 2022-03-07 08:24:20 |
| 99 | sync02 | wh | 2022-03-07 08:24:31 |
+-----+--------+------+---------------------+
11 rows in set (0.06 sec)
6. 其它DML及DDL同步
  • 验证下其它update、delete语句及DDL语句同步情况
# 源端 MySQL 执行操作
mysql> delete from test where id >= 100;
Query OK, 2 rows affected (0.01 sec)
mysql> delete from test where id > 3;
Query OK, 6 rows affected (0.01 sec)
mysql> update test set name = 'actionsky' where id = 3;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0

# 目标端 DBLE 检查同步情况
mysql> select * from test;
+----+-----------+------+---------------------+
| id | name | city | dt |
+----+-----------+------+---------------------+
| 1 | ren | sh | 2022-03-07 06:53:30 |
| 2 | jack | bj | 2022-03-07 06:53:41 |
| 3 | actionsky | sz | 2022-03-07 06:53:59 |
+----+-----------+------+---------------------+
3 rows in set (0.01 sec)

# 源端 MySQL 执行 DDL操作
mysql> alter table test add column info varchar(20) default 'hello';
mysql> update test set info = 'thanks' where id = 3;
mysql> alter table test add index idx_info(`info`);

# 目标端 DBLE 可以进行DDL同步(篇幅所限,实际上DBLE兼容的DDL语句都能同步成功)
mysql> select * from test;
+----+-----------+------+---------------------+--------+
| id | name | city | dt | info |
+----+-----------+------+---------------------+--------+
| 1 | ren | sh | 2022-03-07 06:53:30 | hello |
| 2 | jack | bj | 2022-03-07 06:53:41 | hello |
| 3 | actionsky | sz | 2022-03-07 06:53:59 | thanks |
+----+-----------+------+---------------------+--------+
3 rows in set (0.02 sec)

mysql> show create table test\G
*************************** 1. row ***************************
Table: test
Create Table: CREATE TABLE `test` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(20) COLLATE utf8mb4_bin DEFAULT NULL,
`city` varchar(20) COLLATE utf8mb4_bin DEFAULT NULL,
`dt` datetime DEFAULT CURRENT_TIMESTAMP,
`info` varchar(20) COLLATE utf8mb4_bin DEFAULT 'hello',
PRIMARY KEY (`id`),
KEY `idx_ctiy` (`city`),
KEY `idx_info` (`info`)
) ENGINE=InnoDB AUTO_INCREMENT=89 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
1 row in set (0.01 sec)

配置 MySQL-To-DBLE 注意事项

1. 检测连接失败问题

  • 问题描述:DTLE配置JOB过程中,“检测连接”失败,nomad日志报错ERROR 1064 (HY000): java.sql.SQLSyntaxErrorException: illegal value[TRUE]
  • 原因:DTLE下发的检测客户端语句set autocommit=true,在DBLE某些版本中不支持
  • 解决:升级DBLE到3.20.10.6版本及之后

2. 任务启动后同步失败报’dtle’不存在

  • 问题描述:DTLE同步任务启动后报错, nomad 日志出现Can’t create database ‘dtle’ that doesn’t exists.
  • 原因:
  • DTLE To MySQL ,不会出现该种报错
  • DTLE To DBLE ,由于DBLE中间件中schema的创建方式与普通MySQL不一致,所以该create语法不支持
  • 解决:
  • 需要对DBLE进行额外的 Schema/Table 配置,参考前文 sharding.xml 和 user.xml中相关配置

3. 任务启动后同步失败报’Data too long’

  • 问题描述:DTLE同步任务启动后报错,nomad日志出现“applier error/restart: insert gno: Error 1406: Data too long for column ‘source_uuid’ at row 1”
  • 原因:
  • DTLE在DBLE中创建的表gtid_executed_v4 中,字段source_uuid的Binary数据类型长度不够
  • 也可通过排查DBLE中间件日志(core/log/dble.log),报错信息为“execute sql err : errNo:1406 Data too long for column ‘source_uuid’ at row 1”
  • 解决:
  • DBLE中,修改字段
  • alter table gtid_executed_v4 modify column source_uuid binary(60);

结论

  • DTLE 目前功能基本可以满足 MySQL -> DBLE 间数据实施同步需求,不过需要注意的是,不建议采用本文所提到的 全量同步 方式
  • 生产环境实施由于 MySQL 老库数据量较大,可以先将数据全量逻辑备份出来(需记录GTID位点),再通过 DBLE 自带的 split 工具进行拆分后进行导入,然后再使用 DTLE 基于GTID位点增量同步 的方式进行数据同步
  • DTLE 创建 To-DBLE 任务前需要关注下前文所示注意事项,尤其是提前准备好 sharding.xml、user.xml文件并创建好 DTLE的元数据表gtid_executed_v4。